Roadmap

Version 3 of 10: You are viewing an older version of this document, as it appeared on Apr 22 2019, 4:21 PM.

Fundamentals

Xreate was created to solve a problem (making it easy and convenient to write efficient and safe code) by offering a solution (combining the largely opposite declarative and imperative approaches so that they cancel each other's limitations, with annotations linking them together). That is the non-negotiable foundation; everything else is open to discussion.

Implemented, but not documented yet

1. Versions

Versions is a language construct to deal with mutable data. A short example:

Example 1

x{0} = {year = 1953, month = "March"} :: Date.
x{1} = x{0} + {month = "February"} :: Date. //updates a single field of the structure

x{0} and x{1} are different versions of the same variable. The trick is that, for all intents and purposes, x{0} and x{1} behave like different variables even though they are not. All analyses treat them as distinct immutable variables, so everything looks sound. The compiler, however, uses the same memory address for both, which turns the second definition into an in-place update of the variable. Versions are a hint from the developer that only one version of a variable should be available at any given time. The only analysis that knows the truth is Versions Analysis. Its task is to make sure that no expression uses an outdated, unknown or unreachable version of a variable. Another example, this time a counterexample:

Example 2

x{0} = 8:: int. 
x{1} = x{0} + 10:: int. 
y = x{0} + x{1} :: int.

Versions Analysis builds a variable's liveness graph to detect situations where an expression cannot be computed because it refers to a (possibly) outdated version, or because the analysis has lost track of the current version due to branching and the like. In Example 2, y reads x{0} after x{1} has already replaced it.
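The check can be illustrated with a small model. This is only a sketch in Python, not the actual Versions Analysis: it assumes straight-line code without branching, represents each statement as a hypothetical (target, expression) pair of strings, and the name check_versions is invented for illustration.

```python
import re

# Versions are written as name{N}, as in the examples above.
VERSION_REF = re.compile(r"(\w+)\{(\d+)\}")

def check_versions(statements):
    """Report reads of unknown or outdated versions.

    Assumption: no branching, so the current version of a variable
    is simply the last one that was defined.
    """
    current = {}   # variable name -> latest defined version number
    errors = []
    for target, expr in statements:
        # Check every versioned read in the expression first.
        for name, ver in VERSION_REF.findall(expr):
            ver = int(ver)
            if name not in current:
                errors.append(f"{target}: {name}{{{ver}}} is unknown")
            elif ver != current[name]:
                errors.append(
                    f"{target}: {name}{{{ver}}} is outdated, "
                    f"current is {name}{{{current[name]}}}"
                )
        # Then record the new version, if the target is a versioned variable.
        defined = VERSION_REF.fullmatch(target)
        if defined:
            current[defined.group(1)] = int(defined.group(2))
    return errors

example_1 = [("x{0}", '{year = 1953, month = "March"}'),
             ("x{1}", 'x{0} + {month = "February"}')]
example_2 = [("x{0}", "8"),
             ("x{1}", "x{0} + 10"),
             ("y", "x{0} + x{1}")]

print(check_versions(example_1))
print(check_versions(example_2))
```

Under these assumptions Example 1 passes, while Example 2 is rejected because y reads x{0} once x{1} is the current version. Handling branches would require the graph-based approach described above.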

NOTE: The Versions feature is at a very early prototype stage as of now and requires additional work.

Critical Path: Most important, but still not implemented aspects

Pointers. Memory management

As of now, Xreate does not have pointers or overt memory management. Some directions:

Memory allocation. Automatically (with manual hints via annotations) recognize and decide where to allocate each variable:
- Stack.
- A fixed-size memory segment, possibly with predefined addresses, to allocate the required memory really fast.
- Heap, for the cases where the reasoning can't offer anything better.

Garbage collection. Automatically (with manual hints via annotations) recognize and decide how to collect each variable:
- Link the lifetime of one variable a to another variable b. Once b is collected, a is collected as well.
- Scope-based lifetime. Note that this is not necessarily the scope where the variable in question is defined. It is easy to detect that a scope or function is invoked in a loop, so it would be better to choose some other scope where it is explicitly allowed to take a moment to dispose of all the accumulated garbage.
- Reference counting, if no other strategy is applicable.
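The linked-lifetime strategy can be sketched as follows. This is a hypothetical Python model, not a description of any existing Xreate mechanism: the function collect and the linked_to mapping are names invented for illustration.

```python
def collect(root, linked_to):
    """Return the set of variables collected when `root` is collected.

    `linked_to` maps a variable to the variable whose lifetime it follows
    (hypothetical representation). Links are followed transitively.
    """
    collected = {root}
    changed = True
    while changed:
        changed = False
        for var, owner in linked_to.items():
            if owner in collected and var not in collected:
                collected.add(var)
                changed = True
    return collected

# a follows b, c follows a, d follows an unrelated variable e.
links = {"a": "b", "c": "a", "d": "e"}
print(sorted(collect("b", links)))
```

Collecting b here also collects a and, transitively, c, while d is left alone because its owner e is still alive.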

Mutable data

Mutable data is covered by Communication. Communication checks the order in which mutable variables are read and written in order to detect value loss and other inconsistencies. Communication is still a very briefly sketched prototype. In essence, pointers, variable versions (described above) and Communication are crucial building blocks for the language.
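One way to picture such an ordering check is the sketch below. It is a Python model under stated assumptions, not the Communication prototype itself: accesses are given as a flat, hypothetical list of ("read" or "write", variable) events, and only two inconsistencies are detected (a value overwritten before being read, and a read before any write).

```python
def check_communication(events):
    """Scan read/write events in order and report inconsistencies.

    Returns a list of (event index, variable, message) tuples.
    """
    state = {}    # variable -> "unread" (written, not yet read) or "read"
    issues = []
    for i, (op, var) in enumerate(events):
        if op == "write":
            if state.get(var) == "unread":
                issues.append((i, var, "value lost: overwritten before being read"))
            state[var] = "unread"
        elif op == "read":
            if var not in state:
                issues.append((i, var, "read before any write"))
            else:
                state[var] = "read"
        else:
            raise ValueError(f"unknown operation: {op}")
    return issues

events = [("write", "a"), ("write", "a"), ("read", "a"), ("read", "b")]
for issue in check_communication(events):
    print(issue)
```

The second write to a is flagged because the first value was never read, and the read of b is flagged because nothing was ever written to it.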

Containers and strings

Once pointers are done, it becomes possible to focus on Containers. Containers is an extension that automatically chooses the most appropriate implementation out of a pool of existing ones, depending on usage. The same goes for strings. For example, on encountering the foreign C function strcmp(a, b), the Containers reasoning should suggest a C-style string as the implementation for the variables a and b. On the other hand, if it encounters an internal function StrLen(a) that requires (via an annotation) an implementation with the string size already stored, it should infer something like a Pascal-style string (with its length stored in the first byte). As of now, the Containers reasoning decides which structures to make eager and which lazy. Basically, this covers a whole range of constructs, techniques and approaches from different languages related to strings, laziness, generators, etc.
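The string example can be modeled as matching requirements against features. The sketch below is written in Python and is purely illustrative: the feature names (null_terminated, stored_length), the two tables and the function infer_implementation are assumptions made for this example, not part of the Containers extension.

```python
# Hypothetical features offered by each available implementation.
IMPLEMENTATIONS = {
    "c_string": {"null_terminated"},
    "pascal_string": {"stored_length"},
}

# Hypothetical requirements each function places on its string arguments.
REQUIREMENTS = {
    "strcmp": {"null_terminated"},   # foreign C function
    "StrLen": {"stored_length"},     # internal function, annotated requirement
}

def infer_implementation(used_functions):
    """Return the implementations that satisfy every observed requirement."""
    required = set()
    for fn in used_functions:
        required |= REQUIREMENTS.get(fn, set())
    return [name for name, features in IMPLEMENTATIONS.items()
            if required <= features]

print(infer_implementation(["strcmp"]))
print(infer_implementation(["StrLen"]))
print(infer_implementation(["strcmp", "StrLen"]))
```

With these tables a variable passed only to strcmp gets the C-style candidate and one passed only to StrLen gets the Pascal-style candidate. A variable passed to both has no candidate in this toy pool, which is the kind of situation where a richer pool or a preference-based choice would be needed.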

Exceptions

Not implemented yet. More on this later.

Features not implemented yet that do not belong to the Critical Path

Type system and type inference.

The type systems of modern languages are somewhat convoluted because they take over some first-order and higher-order logic features. For example, phantom types are used to statically check that validated and unvalidated data are used correctly. In Xreate this situation has to be rectified by drawing a border between what is the responsibility of the hard-coded type system and what belongs to Transcend proper. First, when Transcend cannot prove correctness statically, it can assign a special strategy to check the properties in question dynamically, or make adjustments to minimize possible damage if the program really is incorrect. Second, Transcend is able to work with preferences, choosing the most appropriate solution out of a number of possible candidates. In any case, a basic ADT type system is implemented as of now.

Numbers and automatic conversion.

The idea behind numerical types is to work at different levels of detail:

- iN, e.g. i32, denotes a fully defined size. This automatically implies the highest level of compiler warnings regarding conversions, overflows, etc.
- int is reserved as a placeholder for which the compiler picks an appropriate integral type that stays reasonably efficient. As of now it is hard-aliased to i32.
- num is reserved as a placeholder for which the compiler picks an appropriate integral or floating-point type when a developer can't specify anything better. As of now it is hard-aliased to i32.

Contracts and Diagnostics.

Transcend can easily be used to statically check software contracts and to issue warnings that point straight at the problem. An example:

//harmful function that has multiple bugs 
getNumber = function:: int; bug(4379); bug(12990) 
{ 0 }

client = function:: int; contract(call_only_good_functions)
{ getNumber() }

Transcend rules to support contract call_only_good_functions:

bind_func(X, bad_function):- bind_func(X, bug(_)). //marks bugged function as bad_function

warning("Contract violation: call_only_good_functions") :-  //looks for the contract violations
  cfa_call(Scope, BadFn);   
  fn_scope(GoodFn, Scope);
  bind_func(BadFn, bad_function);
  bind_func(GoodFn, contract(call_only_good_functions)).

This excerpt prohibits functions annotated with contract(call_only_good_functions) from calling functions with identified bugs.
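For readers unfamiliar with logic programming, the same check can be modeled imperatively. The Python sketch below is a simplified illustration, not how Transcend evaluates the rules: it flattens scopes into callers (the rules above go through cfa_call and fn_scope), and the function contract_violations with its input format is hypothetical.

```python
def contract_violations(annotations, calls):
    """Find calls from contract-bound functions to functions marked with bugs.

    annotations: function name -> set of annotation strings
    calls:       list of (caller, callee) pairs
    """
    # Mirrors the first rule: any function with a bug(_) annotation is bad.
    bad = {fn for fn, anns in annotations.items()
           if any(a.startswith("bug(") for a in anns)}
    # Functions that promised to call only good functions.
    guarded = {fn for fn, anns in annotations.items()
               if "contract(call_only_good_functions)" in anns}
    return [(caller, callee) for caller, callee in calls
            if caller in guarded and callee in bad]

annotations = {
    "getNumber": {"bug(4379)", "bug(12990)"},
    "client": {"contract(call_only_good_functions)"},
}
calls = [("client", "getNumber")]

for caller, callee in contract_violations(annotations, calls):
    print(f"Contract violation: call_only_good_functions ({caller} calls {callee})")
```

For the example functions above, the call from client to getNumber is reported as a violation.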

Supervision.

Similar to Contracts and Diagnostics described above, but at the level of the environment where code is compiled or executed. Supervision defines a contract between an environment (OS, machine, local network) and the compiled or executed software, e.g. ensuring that the software can't use a web camera or open local files. Already compiled software has to carry a certificate, issued by a trusted compiler, stating that it indeed does not, say, open any local files; the certificate, rather than the actual source code, is then checked for compliance with the environment's rules. This requires no special support from the underlying OS and avoids sandboxing and the related performance degradation.

Why subsystem
Time machine

Language's drawbacks and disadvantages

Peculiarities that need special care to conceal and mitigate

Indefinite compilation time. There is a clear trade-off of execution efficiency against compilation time. Compilation time is cheaper in the sense that software is compiled once on a developer's machine but run many times on many users' machines. However, Transcend, being just a name for a SAT logic solver backend, inherits that solver's usual inefficiency. Thus the usual AI measures apply.
Everything is rewritten, reordered, transformed, adapted and silently decided. A whole lot of different things are done under the hood, which adds a great amount of complexity.

Language's distinctive possible applications and selling points

Convergent and spread out software.
Combinatorial libraries. Software factories.
Low-end hardware. IoT.
Software that cannot be written by professional developers. Scientific software and massive computations.

License

The compiler is under Mozilla's MPL. This means that the code can be combined with GPL'ed code as well as with proprietary code (on condition that the MPL'ed and proprietary parts reside in different files). The intention is to make the compiler a clearly FOSS project while keeping the ability to use it with proprietary plugins, extensions, IDEs, etc. as a very remote but possible option. To make this possible, contributors are free to assign their own licenses to their respective parts as long as the license in question is at least as permissive as MPL (which is expected), such as MIT or something of this sort. In any case, not a single stroke will be used without the contributor's fully informed consent.

GPL license: for the reasons above, the source code of the core components may not be GPL'ed. However, there are still legitimate uses for GPL (or other restrictive licenses). The compiler has two distinct backends (or rather three):

Brute byte code generation backend. Class LLVMLayer (and all the files in the compilation folder) is responsible for producing byte code (currently LLVM). If someone adds an interface to a different backend, notably GCC, it may well have a different license compatible with the respective backend.
A logic solver backend. Class TranscendLayer (and the files in the analysis folder) compiles logic programs for an external solver (currently Clasp) to process and send a solution back. There are many different solvers out there, so the same considerations apply.
A foreign interface backend: an interface to different languages. Currently it is clang, used to call external C functions in Xreate. Bridges to different languages may require different licenses.