Wei Xia edited this page Jul 7, 2015 · 3 revisions

Semantics Conversion

Classes and Structs

For declarations, classes are compiled to classes, and structs are compiled to structs.

For instances, classes are wrapped with a shared pointer (sys::sptr, shortened to _). For example, string s = "123" is compiled to _<string> s = "123" because string is a reference type (class).

Reference type instances all live on the heap, and value type instances all live on the stack, the same as in C#. Once escape analysis is implemented, some reference type instances will be placed on the stack too.
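The rules above can be sketched in plain C++. This is only an illustration: std::shared_ptr stands in for sys::sptr, and the names Ref, Point, Counter, and copy_semantics_demo are hypothetical, not part of the CS++ runtime.

```cpp
#include <cassert>
#include <memory>

// C# struct -> C++ struct: a value type, stored directly on the stack.
struct Point { int x = 0; };

// C# class -> C++ class: instances are only reached through a shared pointer.
class Counter { public: int n = 0; };

// Stand-in for sys::sptr (assumption: the real template is not shown here).
template <typename T>
using Ref = std::shared_ptr<T>;

long copy_semantics_demo() {
    Point a;
    Point b = a;   // copies the value
    b.x = 5;
    assert(a.x == 0);  // the original is unaffected

    Ref<Counter> c = std::make_shared<Counter>();  // object lives on the heap
    Ref<Counter> d = c;  // copies the reference, not the object
    d->n = 5;
    assert(c->n == 5);   // both names see the same object

    return c.use_count();  // 2: two references share one object
}
```

Copying a value type duplicates the data, while copying a reference type only bumps the reference counter.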

Interfaces

Interfaces are compiled to abstract classes with pure virtual methods, following common C++ practice.
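As a rough sketch of that mapping, a hypothetical C# interface IGreeter could come out as the abstract class below. The interface, the English class, and the helper function are made up for illustration, and std::shared_ptr again stands in for sys::sptr; the real compiler output may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical C# source:
//   interface IGreeter { string Greet(string name); }
//   class English : IGreeter { ... }

// The interface becomes an abstract class with only pure virtual methods.
struct IGreeter {
    virtual ~IGreeter() = default;  // safe destruction through the base type
    virtual std::string Greet(const std::string& name) const = 0;
};

// The implementing class overrides every pure virtual method.
class English : public IGreeter {
public:
    std::string Greet(const std::string& name) const override {
        return "Hello, " + name;
    }
};

// Callers hold the implementation through the interface type.
std::string greet_via_interface(const std::string& name) {
    std::shared_ptr<IGreeter> g = std::make_shared<English>();
    assert(g != nullptr);
    return g->Greet(name);  // dispatched virtually to English::Greet
}
```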

Memory Management

GC Algorithm

The algorithm is almost the same as the one used by Apple's Obj-C and Swift: it uses reference counters to manage memory. This differs from C#, which uses a "mark-sweep"-like algorithm. Both algorithms have their advantages and disadvantages.

In general, a "mark-sweep" GC usually occupies much more memory and sometimes freezes the program while it's collecting garbage, but provides better overall throughput, so it's generally more suitable for servers. A reference-counting GC usually takes much less memory, and the program runs more smoothly and predictably, so it's generally more suitable for clients, especially portable devices like phones.

How about system programming and game programming? The best practice is usually to implement performance-critical components in pure native C, like the process scheduler in an OS kernel or the graphics engine in a 3D game, and to implement everything else in a GC-enabled language. For example, the Unreal game engine is implemented in C/C++, but it also offers a GC-enabled scripting language called UnrealScript to help developers implement game scenarios etc.

Because CS++ is essentially C# syntax for C++, it fits this design even better:

  1. It's eventually compiled to machine code, with no script interpreter at runtime
  2. No data marshaling overhead when interacting with native C/C++ components
  3. It still offers great development productivity

GC Overhead

In terms of space, the overhead is 8 bytes for each object on the heap (4 bytes for the strong reference counter and 4 bytes for the weak reference counter, the same for both 32-bit and 64-bit programs). There is no space overhead on the stack. In terms of time, the overhead is the reference counter operations performed when objects are created, copied, and deleted.

In contrast, the implementation in the C++ standard library (std::shared_ptr) needs an additional heap allocation (the memory for the reference counters is allocated separately, unless std::make_shared is used, which typically combines the two) and twice the space on the stack (the pointer to the object and the pointer to the reference counters) for each object.
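To make the space comparison concrete, here is a minimal intrusive reference-counting sketch. It is not the actual sys::sptr implementation: RefHeader, IntrusivePtr, and Widget are hypothetical names, the counters are not thread-safe, and weak-reference handling is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical per-object header: two 32-bit counters, 8 bytes in total
// on both 32-bit and 64-bit targets.
struct RefHeader {
    std::uint32_t strong = 0;
    std::uint32_t weak = 0;  // unused in this sketch
};
static_assert(sizeof(RefHeader) == 8, "two 32-bit counters");

// The counters live inside the object, so the handle is a single pointer.
template <typename T>
class IntrusivePtr {
    T* p_ = nullptr;
public:
    IntrusivePtr() = default;
    explicit IntrusivePtr(T* p) : p_(p) { if (p_) ++p_->strong; }
    IntrusivePtr(const IntrusivePtr& other) : p_(other.p_) { if (p_) ++p_->strong; }
    IntrusivePtr& operator=(IntrusivePtr other) { std::swap(p_, other.p_); return *this; }
    ~IntrusivePtr() { if (p_ && --p_->strong == 0) delete p_; }
    T* operator->() const { return p_; }
    std::uint32_t use_count() const { return p_ ? p_->strong : 0; }
};

// A heap object embeds the header by deriving from it.
struct Widget : RefHeader { int value = 42; };

static_assert(sizeof(IntrusivePtr<Widget>) == sizeof(void*), "one pointer wide");

std::uint32_t intrusive_demo() {
    IntrusivePtr<Widget> a(new Widget);  // one allocation: object plus counters
    IntrusivePtr<Widget> b = a;          // copy increments the strong counter
    assert(b->value == 42);
    return a.use_count();                // 2
}
```

On mainstream standard library implementations sizeof(std::shared_ptr<Widget>) is two pointers wide, though the C++ standard does not require that.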

Reference Cycles

We use the sys::wptr template class to break reference cycles. In C#, you only need to apply the [WeakRef] attribute to a class field to mark it as a weak reference, e.g. class Node { [WeakRef] ParentNode Parent; }

This problem only affects reference types (classes), though.
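The effect of a weak back-reference can be sketched with the standard library types. Here std::shared_ptr and std::weak_ptr stand in for sys::sptr and sys::wptr, and Node, g_destroyed, and weak_parent_demo are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <memory>

static int g_destroyed = 0;  // counts destructor runs for the demo

struct Node {
    std::shared_ptr<Node> child;  // strong: the parent owns its child
    std::weak_ptr<Node> parent;   // weak: plays the role of [WeakRef] Parent
    ~Node() { ++g_destroyed; }
};

int weak_parent_demo() {
    g_destroyed = 0;
    {
        auto parent = std::make_shared<Node>();
        auto child = std::make_shared<Node>();
        parent->child = child;
        child->parent = parent;  // does not increase the strong counter
        assert(parent.use_count() == 1);
        assert(child.use_count() == 2);
    }  // both nodes are released here because no strong cycle exists
    return g_destroyed;  // 2
}
```

If parent were a strong reference instead, each node would keep the other alive and neither destructor would run when the scope ends.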

Reference Cycles Detection

  1. This problem is best avoided through system design
  2. We will implement some detection at compile time to catch the most obvious reference cycles
  3. At runtime, most C++ memory leak detection tools can be used to detect this problem
  4. Use the C# IDisposable design pattern to release resources (e.g. file handles) instead of destructors (which C++ recommends, but which are not recommended here). Thus even when a memory leak occurs, the resources are still released properly. Yes, we support C# using blocks
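The last point can be sketched as follows. This is an assumption about how a using block might look on the C++ side, not the actual generated code: IDisposable, Using, FileHandle, and using_demo are hypothetical names, and std::shared_ptr stands in for sys::sptr.

```cpp
#include <cassert>
#include <memory>

struct IDisposable {
    virtual ~IDisposable() = default;
    virtual void Dispose() = 0;
};

// Scope guard standing in for a C# using block: calls Dispose() on scope exit.
class Using {
    IDisposable& target_;
public:
    explicit Using(IDisposable& target) : target_(target) {}
    ~Using() { target_.Dispose(); }
    Using(const Using&) = delete;
    Using& operator=(const Using&) = delete;
};

// Pretend resource: "open" until disposed.
class FileHandle : public IDisposable {
    bool open_ = true;
public:
    bool IsOpen() const { return open_; }
    void Dispose() override { open_ = false; }  // safe to call more than once
};

bool using_demo() {
    auto file = std::make_shared<FileHandle>();
    auto lingering = file;  // extra reference, e.g. one held by a cycle
    {
        Using guard(*file);
        assert(file->IsOpen());
    }  // Dispose() runs here, regardless of how many references remain
    // The object is still alive (two references), but the resource is closed.
    return !lingering->IsOpen() && lingering.use_count() == 2;
}
```

Because the resource is released by Dispose() rather than by the destructor, an object kept alive by a leaked reference no longer holds the file handle open.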

Further Optimization (not implemented yet)

Escape Analysis (to put more objects on stack)