Introduction
General information about Xreate

Xreate is an open-source, general-purpose, high-level programming language designed for writing efficient and safe computer programs.

Here "high level" refers to the developer-oriented side: the ability to easily write, read, and reuse software, as well as to adapt it to a constantly changing environment or business goals. In this respect, any software product can be evaluated along three dimensions: efficiency, safety, and flexibility. Unfortunately, these properties have proved to be largely contradictory: it is manageable to write either efficient (yet unsafe) or safe (yet impractical) code, but not both. The ultimate goal of the language is therefore to allow developers to produce code that has all these properties at the same time. Blending features of seemingly incompatible programming paradigms is the basis of Xreate's design principles.

To achieve these design goals, Xreate consists of three distinct layers:

  • Brute. The lowest layer is called Brute: this is the code that is actually compiled. Code at this level implements the software's functionality. It resembles the apparatus of conventional imperative languages and consists of executable instructions such as arithmetic, branching, input/output, etc.
  • Transcend. Brute alone is not enough to constitute a full-fledged language, since code requires various non-executable metadata to express the developer's intents, to check correctness and validity, and to perform other kinds of analysis. In Xreate, everything of this sort belongs to a declarative layer called Transcend. Transcend is a logic reasoner suited to management-type work: it analyzes, oversees, and controls Brute by guiding the compilation process. More precisely, everything on this level (logic, or transcend, facts and rules) is gathered and sent to an external logic solver, and the solutions it produces are brought back to guide compilation. Unlike conventional static analysis tools, Transcend directly controls compilation (see Basic Example) and is able to make decisions based even on data available only at runtime (see Late Transcend).
  • Interpretation. There is also Interpretation — the intermediate level resembling dynamically typed languages that is used as a contact point and interpreter between Brute and Transcend. See an example.

On a syntactic level, Xreate is a procedural language that makes extensive use of annotations: arbitrary, unconstrained metadata that a software developer can attach to different language constructs, variables, and code blocks. Annotations are completely invisible to the compiler proper and are used by Transcend as suggestions that convey additional information.

Several extensions are already implemented to give a feeling of what this structure can be used for. The Containers chapter describes how it is possible to reason about, and automatically choose, the most appropriate implementation of a data structure depending on how it is used in the code. Look at the example below:

x = [1, 2, 3]:: [int].

Container x does not have a well-defined implementation just yet. Only by looking at how it is used throughout the code is the compiler able to decide exactly how to store the container's data.
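
The idea of deriving an implementation from usage can be illustrated outside Xreate. The following Python sketch is only a model of the reasoning, not of the compiler: the operation names (iterate, append, random_access) and the implementation names (array, linked_list) are hypothetical and are not taken from the Containers chapter.

```python
# Hypothetical model: pick a container implementation from observed usage.
# Operation and implementation names are illustrative assumptions only.

def choose_implementation(operations):
    """Return an implementation name given the operations applied to a container."""
    ops = set(operations)
    if "random_access" in ops:
        # Access by position is served well by contiguous storage.
        return "array"
    if ops <= {"iterate", "append"}:
        # Purely sequential use does not require contiguous storage.
        return "linked_list"
    # Unknown operations: fall back to a conservative default.
    return "array"


if __name__ == "__main__":
    print(choose_implementation(["iterate", "append"]))
    print(choose_implementation(["iterate", "random_access"]))
```

In Xreate itself such a decision would be expressed as transcend rules over facts gathered from the code, rather than as a function called at runtime.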

Interaction of different components and joint use of external resources is covered by Exploitation:

logger = createFileLogger("/some/file"):: Logger. 
...

write(logger, "First log entry").
...

write(logger, "Last log entry").

Exploitation reasoning makes it possible to determine which access to a resource such as a file is the first and which is the last; in other words, it infers the access order. As a result, related resources can be initialized and destructed automatically. Unlike RAII, another related technique, Exploitation is intended to manage resource usage that spans different parts of a program: modules, plugins, etc.
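
As a rough illustration of access-order inference, the Python sketch below finds the first and last access to a resource in a linear trace of operations and places initialization and destruction around them. It is a simplification under stated assumptions: the trace format and the init and destroy operation names are hypothetical, and a real analysis would have to work on a program's control flow rather than on a single linear trace.

```python
# Hypothetical model of access-order inference over a linear trace.
# A trace is a list of (resource, operation) pairs in execution order.

def infer_access_order(trace, resource):
    """Return (first_index, last_index) of accesses to resource, or None."""
    indices = [i for i, (res, _) in enumerate(trace) if res == resource]
    if not indices:
        return None
    return indices[0], indices[-1]


def instrument(trace, resource):
    """Insert 'init' before the first access and 'destroy' after the last one."""
    bounds = infer_access_order(trace, resource)
    if bounds is None:
        return list(trace)
    first, last = bounds
    result = list(trace)
    # Insert the later position first so the earlier index stays valid.
    result.insert(last + 1, (resource, "destroy"))
    result.insert(first, (resource, "init"))
    return result


if __name__ == "__main__":
    trace = [("logger", "write first"), ("db", "query"), ("logger", "write last")]
    for step in instrument(trace, "logger"):
        print(step)
```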

Virtualization reasoning also helps to work with external resources by enabling access control only if and when it is needed. Example:

openFile("/some/file"):: string; assign_sizo(zoneA).
openFile("/some/file"):: string; assign_sizo(zoneB).

If the compiler recognizes file access from different zones, as in this example, it applies a virtualization strategy sufficient to ensure that instructions belonging to different zones do not interfere with each other.
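
The decision "virtualize only when needed" can be modelled in a few lines of Python. This is a sketch under assumptions: the list of (resource, zone) accesses stands in for facts the compiler would collect, and the per-zone path prefix is a hypothetical strategy chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical model: decide per resource whether virtualization is needed.
# An access is a (resource, zone) pair.

def zones_by_resource(accesses):
    """Group the zones that touch each resource."""
    zones = defaultdict(set)
    for resource, zone in accesses:
        zones[resource].add(zone)
    return dict(zones)


def needs_virtualization(accesses):
    """A resource needs virtualization only if several zones access it."""
    return {res: len(zs) > 1 for res, zs in zones_by_resource(accesses).items()}


def virtual_path(path, zone, shared):
    """Illustrative strategy: prefix the path with the zone only when shared."""
    return f"/{zone}{path}" if shared else path


if __name__ == "__main__":
    accesses = [("/some/file", "zoneA"), ("/some/file", "zoneB"), ("/other", "zoneA")]
    shared = needs_virtualization(accesses)
    for resource, zone in accesses:
        print(zone, virtual_path(resource, zone, shared[resource]))
```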

Unlike "pure" academic languages, Xreate targets safe and reliable use of effectful computations, such as the IO covered above and the mutable structures described in the Communication chapter.

Note that the described extensions are not part of the compiler, and developers can write their own custom transcend rules to cover other aspects.

Basic Example

To demonstrate what Xreate is all about, a basic example is given below:

tests/introduction.cpp: Introduction.Doc_Example_1
guard:: iAmVeryFast
{
  div = function(a:: int, b:: int):: int
  {
    a / b
  }
}

guard:: iAmVerySafe
{
  div = function(a:: int, b:: int):: int
  {
    if ( b == 0 ):: int { zeroDivisionErrCode() } else { a / b }
  }
}

test = function:: int; entry; iAmVerySecure
{
  div(10, 5)
}

Here the entry point of the program is the function test, which the compiler recognizes as such because of the annotation entry in its signature. There are also two functions with the same name, div, called specializations. Each specialization has a guard that defines a condition that has to be met in order to invoke that particular specialization. In the example, the specializations of div have the guards iAmVeryFast and iAmVerySafe, respectively. Let's say the code's author writes two specializations: the first is a very fast division implementation, while the second is a very safe one, since it checks for division by zero, although it is "unacceptably slow" due to the extra check instruction. This is the basis of polymorphism: the client code test is able to work with any specialization, and the compiler must decide which one to invoke with the only hint it has, the annotation iAmVerySecure in the signature of test.

NOTE: All annotations (except entry) are custom, defined by the developer.

This is where Transcend comes into play. By adding a transcend rule as shown below, it is possible to associate the annotation iAmVerySecure with invocation of the specialization guarded by iAmVerySafe:

tests/introduction.cpp: Introduction.Doc_Example_1
dfa_callguard(SiteInv, iAmVerySafe):-
  dfa_callfn(SiteInv, "div");
  SiteInv = s(_, _, ScopeInv);
  cfa_parent(ScopeInv, function(FnInv));
  bind_func(FnInv, iAmVerySecure).

Transcend rules are written in ASP (Answer Set Programming) syntax, a common syntax for writing logic programs. This particular rule reads: for any function annotated with iAmVerySecure, the specialization iAmVerySafe is chosen for invocations of div.

NOTE: In this example an appropriate specialization is statically resolved, so the other specialization isn't even compiled.

By providing custom rules it is possible to implement any polymorphism strategy, whether performed statically or dynamically. The example demonstrates the basic workflow: Transcend gathers available information about a program, such as annotations, and uses custom rules to make decisions that guide various aspects of the compilation process, in particular by selecting appropriate specializations as in the example above.
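
For readers unfamiliar with logic programming, the effect of the rule in the basic example can be mimicked in Python. This is a sketch, not how the compiler works: the lookup happens here at runtime, whereas in the example the specialization is resolved statically. The names RULES, SPECIALIZATIONS, resolve_guard, and call_div are hypothetical, and the error code -1 is an arbitrary stand-in for zeroDivisionErrCode().

```python
# Hypothetical model of guard-based specialization selection.

ZERO_DIVISION_ERR_CODE = -1  # arbitrary stand-in for zeroDivisionErrCode()


def div_fast(a, b):
    """Specialization guarded by iAmVeryFast: no checks."""
    return a // b


def div_safe(a, b):
    """Specialization guarded by iAmVerySafe: checks for division by zero."""
    return ZERO_DIVISION_ERR_CODE if b == 0 else a // b


# Guard name -> specialization of div.
SPECIALIZATIONS = {"iAmVeryFast": div_fast, "iAmVerySafe": div_safe}

# Counterpart of the transcend rule: caller annotation -> guard to select.
RULES = {"iAmVerySecure": "iAmVerySafe"}


def resolve_guard(caller_annotations):
    """Return the guard selected by the first matching rule, or None."""
    for annotation in caller_annotations:
        if annotation in RULES:
            return RULES[annotation]
    return None


def call_div(caller_annotations, a, b):
    """Invoke the specialization of div chosen for the annotated caller."""
    guard = resolve_guard(caller_annotations)
    if guard is None:
        raise LookupError("no rule selects a specialization of div")
    return SPECIALIZATIONS[guard](a, b)


if __name__ == "__main__":
    # The function test carries the annotations entry and iAmVerySecure.
    print(call_div(["entry", "iAmVerySecure"], 10, 5))
```

Adding another strategy in this model amounts to adding an entry to RULES, which loosely mirrors adding another transcend rule.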

More Advanced Example

Suppose we write a program to generate a web page consisting of several blocks (e.g. a header, a footer) constructed independently by different parts of our program. In order to organise the code, we express our intention that all blocks should be sent to a client in a very specific order: first a header, then a body, and then a footer, as below:

tests/exploitation.cpp: Exploitation.Doc_ExampleEov_1
eov_expect(
  webpage, header, body; %we expect body to be sent after header 
  webpage, body, footer  %.. and footer after body
).

This is similar to Exploitation: we are working with the external resource webpage and want to take into account the exact order of the operations that exploit the resource. Then, we write code like this:

send("Header"):: Content; eov_checkpoint(webpage, header).

We mark the operations we are interested in as checkpoints (here header is the name of a checkpoint and webpage is the resource the checkpoint refers to) and want to know whether the checkpoints are executed in the expected order, as defined above.

If it so happens that these blocks are constructed in the correct order in our program, we may send them immediately. Otherwise, we must cache the already constructed content until the whole page is generated to ensure correctness. In other words, there is clearly an opportunity for optimization, since caching not only has memory overhead but also increases response latency (the time before the first block is sent). We write two implementations, immediate_output and cached_output, one for each case:

tests/exploitation.cpp: Exploitation.Doc_ExampleEov_1
//type to store either sending error code or cached content
Content = type variant {
  errcode(data::int),
  message(data::string)
}.

//Send immediately:
guard:: immediate_output
{
  send = function(content:: string):: Content 
  {
    errcode(printf("%s", content))
  }
}

//Cache content to send it later:
guard:: cached_output
{
  send = function(content:: string):: Content 
  {
    message(content)
  }
}

These implementations should be registered for the compiler to know which one to use:

tests/exploitation.cpp: Exploitation.Doc_ExampleEov_1
eov_analysis(
  success, immediate_output;
  fail, cached_output
).

The predicate eov_analysis compares the actual execution order of the checkpoints with the expected one and chooses one of the two implementations depending on the result. This way, it is guaranteed that immediate sending is chosen only when it is safe to do so, and that caching is used only when it is necessary.
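
The comparison performed by the analysis can be sketched in Python. The sketch makes simplifying assumptions: the actual order is given as a flat list of checkpoint names, while a real analysis would derive it from the program, and the helper names order_respected and choose_send_guard are hypothetical.

```python
# Hypothetical model of the expected-order check and strategy choice.

# Counterpart of eov_expect: each pair (a, b) means "b is sent after a".
EXPECTED = [("header", "body"), ("body", "footer")]

# Counterpart of eov_analysis: analysis outcome -> guard of send to select.
STRATEGIES = {"success": "immediate_output", "fail": "cached_output"}


def order_respected(actual, expected):
    """Check that every expected pair appears in the required order."""
    position = {name: index for index, name in enumerate(actual)}
    return all(
        a in position and b in position and position[a] < position[b]
        for a, b in expected
    )


def choose_send_guard(actual):
    """Pick the send specialization for the observed checkpoint order."""
    outcome = "success" if order_respected(actual, EXPECTED) else "fail"
    return STRATEGIES[outcome]


if __name__ == "__main__":
    print(choose_send_guard(["header", "body", "footer"]))
    print(choose_send_guard(["body", "header", "footer"]))
```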

Thus we can safely change and adapt our program, knowing that clients always receive the web page's content in the correct order, with the most efficient content delivery strategy employed automatically depending on the particular circumstances.

Differences from other languages

It is convenient to talk about intentions in order to outline the vast landscape of programming languages and point out Xreate's place in it, i.e. to compare languages by whether they allow a developer to express intentions alongside raw code. Such intentions can be used on many occasions, e.g. to validate the code's correctness. Traditionally, the type system is used to declare intentions. On closer inspection, types in (imperative) statically typed languages contain an inseparable combination of the following:

  • Intentions, such as "that variable is a string".
  • Usage patterns: new code should play nicely with the rest of a program, e.g. if a program works with Unicode, a new string variable should also be of a Unicode family type (even if this is not necessary for the given part of the code).
  • Platform constraints, e.g. if a platform supports wide strings natively, a new string variable should also be a wide string.

In this regard, statically typed languages are in general overconstrained: they require developers to provide more information than they intend to. This usually hinders code reuse and adaptation; to work on a new platform, software requires porting, a process of re-expressing the underlying intentions once again under the new platform's constraints.

On the other hand, dynamically typed languages are underconstrained, since they do not allow developers to express all desirable intentions. This leads to disastrous inefficiency and code fragility, because errors can be caught only at runtime.

As an example, OOP languages are hugely successful, among other reasons, because they provide classes to express intentions more finely, e.g. String, UnicodeString, Utf8String, with the exact behaviour hidden in implementation details.

Xreate, in turn, is based on the idea of allowing developers to express as many or as few intentions as they want. This is a way to reach one of the language's goals: to facilitate the writing of easily reusable and adaptable software. On the syntax level, annotations are used to express intentions, e.g. string; i_dont_need(utf8) (note the modality: annotations can express boundaries, preferences, etc.).

This approach obviously leads to problems with software performance and efficiency. For most high-level languages, compilers rely on defaults to fill in the gaps in implementation details left unspecified by the developer. Usually languages provide standard, default data structures that the developer has no control over. Xreate, however, follows another approach: to resolve this contradiction, the compiler determines the necessary implementation details not only from annotations, but also by looking at usage patterns and platform properties, trying to infer the most efficient implementation in any given circumstances.