Syntax
Xreate syntax in detail
The Xreate syntax is based on a number of principles:
- Follows SSA (static single assignment) form: each identifier is defined only once and redefinitions are not allowed.
- Follows literate programming principles where possible. For example, identifiers can be defined in any order, reflecting the developer's preferences and convenience. Regardless of the definition order, expressions are computed based on the dependencies between them. From a practical point of view, literate programming principles simplify code testing, support, catching regressions, etc.
Literals and Expressions
Xreate expressions have a form:
- annotation-list is a list of annotations delimited by semicolons.
Expressions consist of literals and various operations as follows:
Literals | numbers, strings: 5, "Nimefurahi kukujua". |
Lists, records | Record is a collection of elements of different types - {year = 1934, month = "april"}. List is a collection of elements of the same type without keys - {16, 8, 3}. Range is a specific form of a list - [1..18]. |
Arithmetic operations | Basic arithmetic operations: +, -, *, /. |
Boolean operations | Negation example: -isEmpty(). |
Relations | ==, !=, <>, <, <=, >, >=. Both != and <> denote the not-equal relation. Examples: 8>=3, "Blue" <> "Green". |
List and record operations | The index operation accesses individual elements of a list or record. Example: colors = {"Green", "Blue"}::[string]. color = colors[0]:: string. Accessing a record's element: date = {year = 1934, month = "april"}. year = date["year"]. |
Identifiers | Example: a - b |
Functions | Example: result = isOdd(6):: bool. |
Annotations
This chapter is about Brute syntax. See Transcend for details regarding Transcend and annotations syntax.
Code Blocks
A code block is a list of expressions delimited by periods. It consists of a body (the main expression) and an optional set of assignments that define the identifiers used in the body. The result of a code block is the result of computing its body expression. Identifiers are computed before the expressions they appear in, regardless of the order of their definitions.
test = function:: int; entry
{
  a + b:: int

  a = 10:: int.
  b = 2:: int.
}
Above is an example of a code block with a + b as its body expression (it is the body because it does not have the form of an assignment). The body depends on the identifiers a and b, so the compiler computes both of them beforehand.
The computation order depends only on the dependencies between expressions. This approach has the following properties:
- Mutually independent identifiers can be evaluated in any order.
- An identifier is computed only if it is required (at least transitively) by the code block's body expression.
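The evaluation rules above can be modelled in a few lines of Python. This is only a sketch of the semantics, not the compiler's actual algorithm; the helper evaluate_block, the dictionary layout and the extra identifier unused are assumptions made for illustration.

```python
def evaluate_block(body, definitions):
    """Evaluate a block on demand.

    body:        (dependencies, compute) pair for the body expression
    definitions: identifier -> (dependencies, compute), in any order
    Returns the body's value and the set of identifiers actually computed.
    """
    cache = {}
    in_progress = set()

    def value_of(name):
        if name in cache:
            return cache[name]
        if name in in_progress:
            raise ValueError(f"cyclic dependency on {name!r}")
        in_progress.add(name)
        deps, compute = definitions[name]
        # Dependencies are computed first, whatever the definition order is.
        cache[name] = compute(*(value_of(dep) for dep in deps))
        in_progress.discard(name)
        return cache[name]

    body_deps, body_compute = body
    result = body_compute(*(value_of(dep) for dep in body_deps))
    return result, set(cache)


# Mirrors the `test` function above, plus a hypothetical identifier
# `unused` that the body never requires.
definitions = {
    "b": ((), lambda: 2),
    "a": ((), lambda: 10),
    "unused": (("a",), lambda a: a * 100),
}
result, computed = evaluate_block((("a", "b"), lambda a, b: a + b), definitions)
```

In this model only a and b are computed; unused is never evaluated because the body does not depend on it, even transitively.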
Functions
- function-name Name of a function.
- argument Formal parameters. Arguments are delimited by commas.
- type, return-type Types of the formal parameters and of the returned value.
- function-block Code block that acts as a function's definition.
- annotations List of annotations delimited by semicolons.
Below is an example of the function sum, which returns the sum of its two arguments. It also carries several annotations. The first annotation, entry, has a special meaning: it marks the entry point, or main function, of a program. The second annotation, status(needs_review), demonstrates that developers can annotate functions with custom annotations to express related intentions or properties.
sum = function(x:: int, y:: int):: int; entry; status(needs_review) { x + y }
Function Specializations
- annotation Guard expressed in the form of an annotation.
- functions-list One or more functions that share the same guard.
Specialization is a crucial Xreate concept that serves as the principal connection between the Transcend and Brute levels. Xreate allows several functions to share the same name; such functions are called specializations. This is the syntactic foundation for function-level polymorphism, i.e. the ability of the compiler to decide exactly which function to call out of several available options. Polymorphism resolution can happen during compilation (early polymorphism) or at run time (late polymorphism).
Functions with the same name, i.e. different specializations, must have identifiers called guards that uniquely define each specialization. In this sense, a shared name is just a placeholder; only together with a guard does it form the fully qualified function identifier. On the Brute level it is possible to specify only a function's shared name for invocation, while Transcend is responsible for supplying the guard. When a function is called by its name in a program, it is expected that Transcend has already performed the resolution and supplies the correct guard to specify exactly which specialization to call. This gives Transcend deep control over the actual program behaviour.
An example:
guard:: safe_enviroment
{
  sum = function (a::int, b::int):: int
  {
    result = a + b:: int.

    if (-isLastOpOverflow(result)):: int
      { result }
    else
      { overflowErrorCode() }
  }
}

guard:: fast_enviroment
{
  sum = function (a::int, b::int) :: int
  {
    a + b
  }
}
To alter the behaviour of existing code, it is always possible to add new specializations and adjust the Transcend rules to specify the situations in which a new specialization should (or should not) be used.
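The way a shared name and a guard together identify one specialization can be sketched in Python as a registry keyed by (name, guard). This is a loose model rather than Xreate's implementation: the registry, the specialization decorator, the call helper, the 32-bit bounds and the value of OVERFLOW_ERROR_CODE are all assumptions, and the guard argument stands in for the decision that Transcend would supply.

```python
INT_MAX = 2**31 - 1
INT_MIN = -2**31
OVERFLOW_ERROR_CODE = -1  # hypothetical stand-in for overflowErrorCode()

specializations = {}


def specialization(name, guard):
    """Register a function under its fully qualified (name, guard) identifier."""
    def register(fn):
        specializations[(name, guard)] = fn
        return fn
    return register


@specialization("sum", "safe_enviroment")
def sum_safe(a, b):
    result = a + b
    # Checked variant: report an error code instead of an overflowed value.
    return result if INT_MIN <= result <= INT_MAX else OVERFLOW_ERROR_CODE


@specialization("sum", "fast_enviroment")
def sum_fast(a, b):
    # Unchecked variant: wraps around like a 32-bit signed integer.
    return (a + b + 2**31) % 2**32 - 2**31


def call(name, guard, *args):
    """The caller names only `sum`; `guard` models the part chosen by Transcend."""
    return specializations[(name, guard)](*args)
```

Adding a third behaviour in this model means registering one more function under a new guard; existing call sites that name only sum stay untouched.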
Branch Statements
IF Statement
The if statement executes block-true or block-false depending on the condition evaluation's result.
Example:
answer = if (question == "Favorite color?"):: string {"Yellow"} else {"Don't know"}.
SWITCH Statement
- condition Expression that decides which branch to execute next.
- guard Value to match against condition.
- default-code-block Executed if no appropriate case is found.
The switch statement evaluation's result is that of the branch whose guard matches the condition.
An example:
monthName = switch(monthNum) :: string
  case (1) {"Jan"}
  case (2) {"Feb"}
  case default {"It's strange..an unexpected month"}.
Loops
Xreate loops are constructed in such a way that they hide the semantics of actually mutable operations behind an immutable facade compatible with single assignment form.
LOOP Statement
- init-value Initial value a loop starts from.
- accumulator Identifier that holds the loop's result after each iteration.
For each iteration, the accumulator assumes the result of the previous iteration, or init-value for the first iteration. The result of evaluating loop-body is used as the accumulator's value for the next iteration and, after the last iteration, as the result of the whole loop statement.
Note that this notation does not have an explicit termination condition! The compiler relies on the loop body's fixed point to decide when to interrupt the loop. Consider an example:
//an infinite loop
answer = loop (2 -> x) :: int
{
  if(IsPerfect(x)):: int {x} else {x+1}
}.
The example tests numbers for being perfect (the sum of all proper divisors equals the number itself). During the iterations the accumulator x assumes the following values: 2, 3, 4, 5, 6, 6... After the first perfect number is found, no further iteration changes the result, since there is no increment; the loop keeps going through the same number again and again, which makes it infinite. Clearly, x = 6 (the first perfect number) is a fixed point in this example. There is no point in continuing once a fixed point is reached, because the result will not change anymore; the loop can be safely interrupted at this point.
The compiler relies on manually provided annotations to recognize when a fixed point is reached. The special annotation final is reserved to specify a fixed point for loops. Once an expression marked as final is evaluated, it is assumed that a fixed point has been reached; in other words, the compiler knows this is the very last iteration, after which the loop can be terminated safely. The correct code for the example above is:
//a loop exits after the first perfect number is found
answer2 = loop (2->x) :: int
{
  if(IsPerfect(x))::int {x:: int; final} else {x+1}
}.
In this case the compiler is able to recognize that a fixed point is reached in order to know when it is safe to terminate the loop. In the example, the final result answer2 is 6.
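The role of final can be mimicked in Python with a small wrapper that marks a value as a fixed point. This is a sketch of the described behaviour only; the Final class, the loop helper and this particular is_perfect implementation are assumptions introduced for illustration.

```python
class Final:
    """Marks a loop-body result as a fixed point (models the `final` annotation)."""
    def __init__(self, value):
        self.value = value


def loop(init_value, body):
    """Feed each result back as the accumulator until a Final result appears."""
    acc = init_value
    while True:
        result = body(acc)
        if isinstance(result, Final):
            return result.value  # fixed point reached: safe to terminate
        acc = result


def is_perfect(n):
    # A number is perfect if the sum of its proper divisors equals the number.
    return n > 1 and sum(d for d in range(1, n) if n % d == 0) == n


# Counterpart of `answer2`: the accumulator takes the values 2, 3, 4, 5, 6.
answer2 = loop(2, lambda x: Final(x) if is_perfect(x) else x + 1)
```

Without the Final wrapper the lambda would keep returning 6 forever, which corresponds to the infinite variant shown first.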
LOOP FOLD Statement
- list Container to iterate over.
- element Identifier that assumes the value of the currently processed list element.
- type, annotations Expression types and optional annotations delimited by semicolon.
- init-value Accumulator's initial value loop starts from.
- accumulator Identifier that assumes loop-body evaluation result after each iteration.
The loop fold statement is a commonly used particular form of loop: it iterates over a list and accumulates the result by applying the loop-body transformation to each element and an intermediate accumulator. The result of the current iteration is used as the accumulator value for the next iteration. Accordingly, the overall loop value equals that of the accumulator after the last iteration. If a fixed point is found, evaluation terminates earlier.
The example below shows a code excerpt that looks for the minimal element of a given list that is also less than the initial value 10:
numbers = {4, 8, 7, 1, 5}:: [int].
min = loop fold(numbers->x:: int, 10->acc):: int
{
  if (acc > x):: int {x} else {acc}
}.
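A rough Python counterpart of loop fold, including early termination at a fixed point, might look as follows. The names Final, loop_fold and first_above_five are hypothetical, and the second call is an invented illustration of early exit rather than an example taken from Xreate.

```python
class Final:
    """Marks a result as a fixed point (models the `final` annotation)."""
    def __init__(self, value):
        self.value = value


def loop_fold(items, init_value, body):
    """Apply body(element, accumulator) to each element, threading the accumulator."""
    acc = init_value
    for element in items:
        result = body(element, acc)
        if isinstance(result, Final):
            return result.value  # fixed point: skip the remaining elements
        acc = result
    return acc


numbers = [4, 8, 7, 1, 5]

# Counterpart of `min` above: the accumulator takes the values 4, 4, 4, 1, 1.
minimum = loop_fold(numbers, 10, lambda x, acc: x if acc > x else acc)

# Early termination: stop at the first element greater than 5.
visited = []

def first_above_five(x, acc):
    visited.append(x)
    return Final(x) if x > 5 else acc

first_big = loop_fold(numbers, 0, first_above_five)
```

In the second call only the elements 4 and 8 are visited, because the Final result ends the iteration.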
LOOP MAP Statement
- list Container to iterate over.
- element Identifier that assumes value of a currently processed list element.
- type, annotations Type and optional annotations delimited by semicolon.
The loop map statement is a commonly used particular form of loop: it iterates over a list and applies the loop-body transformation to each element. The result is a list that consists of all the transformed elements.
The example below demonstrates creating the even_numbers list by multiplying every element of odd_numbers by 2:
odd_numbers = {1, 3, 5}:: [int].
even_numbers = loop map(odd_numbers -> number:: int) :: [int]
{
  2 * number
}.
Types
Primitive Types:
bool | Booleans. |
i8, i32, i64 | Signed integers; 8, 32 and 64 bits wide, respectively. |
int, num | Currently i32 aliases. Reserved as placeholders for an auto-detected appropriate integral type and for an auto-detected appropriate integral or floating-point type, respectively. |
float | Double precision floating-point numbers. |
string | Currently a null-terminated ANSI char string. Reserved to be a generic type with no particular implementation. A concrete implementation is to be determined similarly to the containers approach. |
* | An unspecified type. Postpones type checks for this expression. Examples: x = {amount=200, currency="USD"}::*. |
Compound types:
[ element-type ] | List of elements of the same type element-type. Examples: x = {1, 2, 3}:: [int] - list of int's. Lists can have different internal implementations. |
{ key:: type, ... } | Record: a list of elements of different types possibly with named keys. Examples: {int, string}, {name::string, age::int}. |
variant {option :: {type, ...}, ...} | ADT type. Variables of this type can hold a value of any type from the list of permitted ones. Examples: variant {FullAddress:: {string, string, string}, ShortAddress:: {string}}. |
slave identifier | Denotes a type constructed by Transcend. See slave types. An example: slave unit_test. |
Type operations:
type [ key ] | Index operation: accessing elements of a compound type. Examples: Bio = type {birth_year:: int, name:: string}. YearBirth = type Bio[birth_year]. |
type ( parameters... ) | Constructs a concrete type with the given parameters. Examples: MyTree = type Tree(int). |
New types are defined as follows:
Examples:
Tuple = type {string, int}.
Int = type Tuple[1]. //accessing by index
List = type(X) [X]. // List of elements of type X.
IntList = type List(int). // type function to construct List of ints.
Slave Types
- predicate Name of a logic predicate
A slave type is a reference to a type defined on the Transcend side. This gives Transcend full control over the program types marked as slave. The type is constructed in such a way that variables of this type are able to hold the predicate's arguments. Type inference works as follows:
- If a predicate has only one argument then a constructed type is a type of this argument: int, string, variant or tuple.
- A constructed type is a record in case of several arguments.
- Predicates correspond to variants in a constructed type.
An example; Transcend facts:
person("John", 1962).
person("Bill", 1961).
The Brute side:
PersonNative = type {string, int}.
Person = type slave person.
In the example above the types PersonNative and Person are equivalent.
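The first two inference rules can be approximated in Python as shown below. This is a guess at the described behaviour for illustration only: the function infer_slave_type, the representation of facts as (name, arguments) pairs, the type-name table and the hypothetical age fact are assumptions, and the rule about variants is not modelled.

```python
# Assumed mapping from Python value types to Xreate type names.
TYPE_NAMES = {str: "string", int: "int", bool: "bool", float: "float"}


def infer_slave_type(predicate, facts):
    """Derive a type description from the arguments of a predicate's facts."""
    shapes = {
        tuple(TYPE_NAMES[type(arg)] for arg in args)
        for name, args in facts
        if name == predicate
    }
    if len(shapes) != 1:
        raise ValueError(f"no single argument shape for predicate {predicate!r}")
    (shape,) = shapes
    if len(shape) == 1:
        return shape[0]                   # one argument: the argument's own type
    return "{" + ", ".join(shape) + "}"   # several arguments: a record


facts = [
    ("person", ("John", 1962)),
    ("person", ("Bill", 1961)),
    ("age", (30,)),  # hypothetical single-argument fact
]
person_type = infer_slave_type("person", facts)
age_type = infer_slave_type("age", facts)
```

For the person facts the sketch yields the same shape as PersonNative, which matches the equivalence stated above.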
Variants
Sometimes it is useful for a variable to have an ability to hold values of different types depending on some conditions, in other words to have a variant type. An example:
Color = type variant {
  White, Black, Magenta,
  CustomColor(r:: int, g:: int, b:: int)
}.

draw = function:: int
{
  clrBorder = Black():: Color.
  clrBackground = CustomColor(50, 50, 50):: Color.

  drawRectangle({0, 0, 100, 100}, clrBorder, clrBackground)
}
SWITCH VARIANT Statement
- condition Expression of a variant type.
- alias Identifier that denotes the unwrapped content of the condition expression within case branches.
- guard Name of a variant to match against actual condition's variant.
- case-branch Code block to execute in case of the matched variant. The condition expression's content is referred to by alias within the branch.
Variant variables require special means to test exactly which variant they contain at any given moment, as well as to access it. Languages that support variant types (ADT) usually solve this problem by means of pattern matching. Xreate intentionally does not support pattern matching, since it depends on the order of parameters, which is plainly unacceptable; besides, it is hardly usable with a large number of parameters. Instead, Xreate supports special syntax to unwrap a variable's content using the switch variant statement.
An example:
Month = type variant {
  MonByName (name:: string),
  MonById (id:: int)
}.

nextMonth = function(month:: Month):: Month
{
  switch variant(month):: Month
  case (MonByName)
  {
    monthName = month["name"]:: string. //here month has {name:: string} type

    MonByName(nextMonthByName(monthName))
  }
  case (MonById)
  {
    monthId = month["id"]:: int. //here month has {id:: int} type

    if(monthId == 11):: Month
      { MonById(0) }
    else
      { MonById(monthId + 1) }
  }
}
The function nextMonth computes the month that follows the given one. The parameter month cannot be accessed directly because it is of a variant type; hence, it should be unwrapped before use. As the example shows, Xreate silently defines a new variable with the same name, month, which holds the unwrapped content, independently for each branch of switch variant.
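The unwrapping behaviour can be modelled in Python by dispatching on a variant's name and handing each branch the content as a record accessed by field name. The Variant class, the switch_variant helper and the MONTH_NAMES table (which replaces the nextMonthByName function that the example leaves undefined) are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    tag: str       # name of the variant, e.g. "MonById"
    content: dict  # named fields of the variant


def switch_variant(value, branches):
    """Pick the branch that matches the variant's tag and pass it the unwrapped content."""
    return branches[value.tag](value.content)


MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def next_month(month):
    return switch_variant(month, {
        # Inside each branch `m` plays the role of the unwrapped `month`.
        "MonByName": lambda m: Variant(
            "MonByName",
            {"name": MONTH_NAMES[(MONTH_NAMES.index(m["name"]) + 1) % 12]},
        ),
        "MonById": lambda m: Variant(
            "MonById",
            {"id": 0 if m["id"] == 11 else m["id"] + 1},
        ),
    })
```

Fields are read by name (m["id"], m["name"]) rather than by position, which is the property the text gives as the reason for avoiding pattern matching.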
Versions
- version Number to specify the identifier's version.
Versions are a language construct for dealing with mutable data. An example:
Date = type {year:: int, month:: string}.

test = function:: Date; entry
{
  x{0} = {year = 1953, month = "March"}:: Date.
  x{1} = x{0} + {month = "February"}:: Date. //updates a single record's field

  x{1} //returned value
}
In the example above, x{0} and x{1} are different versions of the same variable. This is a simple trick: for all intents and purposes x{0} and x{1} behave like different variables, but really are not. All analyses treat them as different immutable variables, yet the compiler actually uses the same memory address for them, which makes this an update of a variable. It is a hint from the developer that only one version of a variable should be available at any given time. The only analysis that knows the truth is the versions analysis. It is responsible for validating the code to make sure that no expression uses an outdated, unknown or unreachable version of a variable. Another (counter)example:
x{0} = 8:: int.
x{1} = x{0} + 10:: int.
y = x{0} + x{1} :: int. //invalid code: uses several versions
The versions analysis builds a liveness graph of variables to track the use of versions and to reveal situations where an expression cannot be computed because it refers to a (possibly) outdated or unreachable version.
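A heavily simplified model of such a check is sketched below in Python. It does not build a liveness graph; it assumes the statements are already listed in evaluation order and only flags uses of a version that is not the latest known one. The function find_invalid_uses and the statement encoding are assumptions, not the actual analysis.

```python
def find_invalid_uses(statements):
    """Flag uses of outdated or unknown versions.

    statements: list of (defined, used) pairs in evaluation order, where
      defined is a (variable, version) pair or None,
      used is a list of (variable, version) pairs read by the statement.
    Returns a list of (statement_index, variable, version) for invalid uses.
    """
    latest = {}
    errors = []
    for index, (defined, used) in enumerate(statements):
        for variable, version in used:
            # Only the latest version of a variable may be read.
            if latest.get(variable) != version:
                errors.append((index, variable, version))
        if defined is not None:
            variable, version = defined
            latest[variable] = version
    return errors


# x{0} = ...; x{1} = x{0} + ...; body uses x{1}
valid = [(("x", 0), []), (("x", 1), [("x", 0)]), (None, [("x", 1)])]

# x{0} = 8; x{1} = x{0} + 10; y = x{0} + x{1}
invalid = [(("x", 0), []), (("x", 1), [("x", 0)]), (None, [("x", 0), ("x", 1)])]

valid_errors = find_invalid_uses(valid)
invalid_errors = find_invalid_uses(invalid)
```

In this model the counterexample is rejected because its last statement reads x{0} after x{1} has already replaced it.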
Records
A record's elements (fields) are denoted by strings, which are used to access a field's value. This makes it possible to use variables to reference fields. Such references are statically resolved during Interpretation. An example:
Employee = type {
  name :: string,
  surname :: string,
  signature:: string
}.

test = function:: string; entry
{
  employee = getAnyEmployee():: Employee.
  primaryKey = "surname":: string.

  employee[primaryKey]
}
In Xreate the left side of any assignment is always an identifier; hence, there is special syntax to update one (or more) of a record's fields. An example:
Day = type {
  year:: int,
  month:: string,
  day:: int
}.

test = function:: Day
{
  tomorrow

  today = {year = 1936, month = "July", day = 21}:: Day.
  tomorrow = today + {day = 22}:: Day.
}
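The update expression today + {day = 22} produces a new record and leaves today unchanged. A loose Python analogue uses a dictionary merge; the helper updated and its rejection of unknown fields are assumptions made for this sketch, not documented Xreate behaviour.

```python
def updated(record, **fields):
    """Return a copy of `record` with the given fields replaced."""
    unknown = set(fields) - set(record)
    if unknown:
        # Assumption: updating a field the record does not have is an error.
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    return {**record, **fields}


today = {"year": 1936, "month": "July", "day": 21}
tomorrow = updated(today, day=22)

# Field access through a variable, as in the Employee example above.
key = "month"
month_of_tomorrow = tomorrow[key]
```

As in the Xreate example, the left side of every assignment is a plain identifier; no field is assigned in place.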