A programming language is a type of project which evaluates a program written in that language. The complexity of programming languages varies a lot. Some are integrated into operating system projects.

Often, a user is required to write the program in a text editor like TextEdit or Notepad, then save it as a .txt file. The file is then imported into a list in Scratch. This simplifies editing, as the Scratch List Editor is not easy to use. Some languages allow programmers to "distribute" code on a forum topic by letting end-users install code by copying and pasting, or in a studio by downloading and importing projects. This allows for virtual devices on which users can install multiple applications.


A programming language implementation consists of a lexer, a parser and an evaluator. It often also has a code optimizer.


The lexer separates the program into substrings that represent different grammatical components, called "lexemes," which are classified as different "tokens." Tokens may be defined using regular expressions. For example: "a=5 + 3" could be separated into the tokens "a", "=", "5", "+", "3". This makes the code easier for the parser to read.
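As a rough illustration, a regular-expression lexer for the example above might look like the following Python sketch. The token class names (NUMBER, NAME, OPERATOR) and the `tokenize` function are assumptions made for this example; a real project would define its own.

```python
import re

# Hypothetical token classes, each defined by a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+(?:\.[0-9]+)?"),
    ("NAME", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR", r"[=+\-*/]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Return a list of (token_class, lexeme) pairs, skipping whitespace."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[position]!r} at {position}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


print([lexeme for _, lexeme in tokenize("a=5 + 3")])
```

Each pair keeps both the lexeme (the matched text) and the token class it belongs to, which is the information the parser needs.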


The parser is responsible for grouping the tokens into structures such as statements, loops, and function definitions. This grouping is done according to a set of rules known as a grammar.

Different parsing algorithms exist, such as LR(1), SLR(1), LALR(1), and LL(1). After parsing, the program is represented as a syntax tree.

The difficulty of creating a parser depends on the complexity of the grammar.


The tokens may be parsed according to a "context-free grammar." A context-free grammar consists of:

  • a set of tokens
  • a set of nonterminals
  • a set of productions
  • a start symbol

Nonterminals are symbols and productions are their definitions. Productions consist of tokens and nonterminals. An example of a production is EXPRESSION;, which may be one of many definitions of the nonterminal STATEMENT.

An example of a grammar is:




FACTOR : FACTOR * number
       | FACTOR / number
       | number

number : [0-9]+(\.[0-9]+)?

LR parsing

An LR parser is a type of bottom-up parser. The state machine of an LR parser maintains a stack of symbols. Based on the current input, it can either shift a token onto the stack or reduce the stack by popping off the right-hand side of a recognized production and replacing it with the nonterminal it defines. For example, the tokens ["1", "+", "19", ";"] could result in actions such as these (not every intermediate reduction is listed):

Action                      Stack
shift "1"                   [1]
reduce FACTOR : number      [FACTOR]
shift "+"                   [TERM, +]
shift "19"                  [TERM, +, 19]
reduce FACTOR : number      [TERM, +, FACTOR]
shift ";"                   [EXPRESSION, ;]
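The shift and reduce actions can be sketched in Python for the FACTOR productions of the example grammar. This is a simplified, greedy illustration and not a table-driven LR(1) parser: it reduces whenever the top of the stack matches a production, trying longer productions first. The names `PRODUCTIONS`, `classify` and `shift_reduce` are assumptions made for this example.

```python
# Productions of the example grammar, longest right-hand side first so that
# "FACTOR * number" is preferred over the bare "number" rule.
PRODUCTIONS = [
    ("FACTOR", ["FACTOR", "*", "number"]),
    ("FACTOR", ["FACTOR", "/", "number"]),
    ("FACTOR", ["number"]),
]


def classify(token):
    """Map a raw token to its grammar symbol."""
    return "number" if token.replace(".", "", 1).isdigit() else token


def shift_reduce(tokens):
    """Return the final stack and a trace of (action, stack) pairs."""
    stack, trace = [], []
    for token in tokens:
        # Shift: push the next input symbol onto the stack.
        stack.append(classify(token))
        trace.append(("shift " + token, list(stack)))
        # Reduce: replace a matched right-hand side with its nonterminal.
        reduced = True
        while reduced:
            reduced = False
            for nonterminal, right_side in PRODUCTIONS:
                if stack[-len(right_side):] == right_side:
                    del stack[-len(right_side):]
                    stack.append(nonterminal)
                    action = f"reduce {nonterminal} : {' '.join(right_side)}"
                    trace.append((action, list(stack)))
                    reduced = True
                    break
    return stack, trace


final_stack, steps = shift_reduce(["6", "*", "2", "/", "3"])
for action, stack in steps:
    print(action, stack)
```

A real LR parser decides between shifting and reducing by consulting a parse table built from the grammar, which is what lets it handle grammars where the greedy choice would be wrong.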


The program may be executed directly from the tree, or it may be translated into an intermediate representation, in which each piece of high-level code is expressed as a sequence of low-level instructions. For example, the code:
1 + 2 * 5

could be represented as these commands:
push 1
push 2
push 5
multiply
add
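A minimal Python sketch of this idea follows. It translates a small syntax tree into stack instructions and then runs them. The tuple-based tree format, the instruction names `multiply` and `add`, and the functions `compile_tree` and `execute` are assumptions made for this example.

```python
# Hypothetical mapping from operators in the tree to instruction names.
INSTRUCTION_NAMES = {"+": "add", "*": "multiply"}


def compile_tree(node):
    """Translate a tree such as ("+", 1, ("*", 2, 5)) into instructions."""
    if isinstance(node, (int, float)):
        return [f"push {node}"]
    operator, left, right = node
    # Operands are emitted first, so they are on the stack when the
    # operator instruction runs.
    return compile_tree(left) + compile_tree(right) + [INSTRUCTION_NAMES[operator]]


def execute(instructions):
    """Run the instructions on a stack and return the final value."""
    stack = []
    for instruction in instructions:
        name, *arguments = instruction.split()
        if name == "push":
            stack.append(float(arguments[0]))
        elif name == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif name == "multiply":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return stack.pop()


program = compile_tree(("+", 1, ("*", 2, 5)))
print(program)
print(execute(program))
```

Because the multiplication is deeper in the tree, its instruction is emitted before the addition, so operator precedence is already resolved by the time the instructions run.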

Useful Features

Jump/Label for flow control

Jump/Label is common in many languages, such as BASIC and Batch. It is based on "labeling" lines, then making the interpreter "jump" to those lines when a jump command is executed. For example:

Label loop
Print "Hi!"
Jump loop

Here, the interpreter will first label line 1 as "loop", then print "Hi!", then jump back to "loop", where it will continue again (i.e. print "Hi!"). Thus, it is an infinite loop. More complex flows can be achieved with conditional jumping, where a jump happens only if a condition is fulfilled. For example:

Var myvar = 10
Label loopcheck
myvar = (myvar-1)
If myvar ~= 0 Then
Jump loopcheck
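One way an interpreter might implement labels and jumps is sketched below in Python. The tuple-based instruction format, the instruction names (`var`, `dec`, `print`, `jump`, `jump_if_nonzero`) and the `max_steps` safety limit are assumptions made for this example and do not correspond to any particular language.

```python
def run_program(program, max_steps=1000):
    """Interpret a list of instruction tuples and return the printed lines."""
    # First pass: record the line index of every label.
    labels = {}
    for index, instruction in enumerate(program):
        if instruction[0] == "label":
            labels[instruction[1]] = index

    variables, output = {}, []
    line, steps = 0, 0
    # Second pass: execute, moving the current line on each jump.
    while line < len(program) and steps < max_steps:
        name, *arguments = program[line]
        steps += 1
        if name == "var":
            variables[arguments[0]] = arguments[1]
        elif name == "dec":
            variables[arguments[0]] -= 1
        elif name == "print":
            output.append(arguments[0])
        elif name == "jump":
            line = labels[arguments[0]]
            continue
        elif name == "jump_if_nonzero":
            if variables[arguments[0]] != 0:
                line = labels[arguments[1]]
                continue
        line += 1
    return output


countdown = [
    ("var", "myvar", 3),
    ("label", "loopcheck"),
    ("print", "Hi!"),
    ("dec", "myvar"),
    ("jump_if_nonzero", "myvar", "loopcheck"),
]
print(run_program(countdown))
```

Collecting the labels before execution starts means a jump can target a label that appears later in the program, not only one that has already been passed.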

Structured control flow

Most commonly-used languages provide constructs for looping ('forever', 'repeat' and 'repeat until' in Scratch) and conditional branching ('if' and 'if else' in Scratch). These often work the same way as jumps, but the labels are named and added by the compiler, allowing code to be reused more easily.
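The following Python sketch shows how a compiler might turn a 'repeat' loop into label and jump instructions, generating the label names itself. The instruction tuples (`var`, `label`, `jump_if_zero`, `dec`, `jump`) and the function `lower_repeat` are hypothetical and chosen only for illustration.

```python
import itertools

# Source of unique suffixes, so generated labels never clash.
_loop_ids = itertools.count()


def lower_repeat(times, body):
    """Translate 'repeat <times> { body }' into label/jump instructions."""
    loop_id = next(_loop_ids)
    counter = f"_counter{loop_id}"
    start = f"_repeat{loop_id}_start"
    end = f"_repeat{loop_id}_end"
    return [
        ("var", counter, times),         # hidden loop counter
        ("label", start),
        ("jump_if_zero", counter, end),  # leave the loop when finished
        *body,
        ("dec", counter),
        ("jump", start),
        ("label", end),
    ]


for instruction in lower_repeat(2, [("print", "Hi!")]):
    print(instruction)
```

Since every call produces fresh label and counter names, the same loop construct can be used many times, or nested, without the programmer having to keep labels unique by hand.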

Arguments for functions

Before a function is called, its arguments should be put in a place that the function can access. Exactly where this place is depends on the implementation of the language.
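One possible choice of place is a shared stack, sketched below in Python. The names `argument_stack`, `call` and `subtract` are assumptions made for this example; other implementations might use dedicated variables or a list reserved for arguments instead.

```python
argument_stack = []  # shared place where callers leave arguments


def call(function, *arguments):
    """Push the arguments, run the function, and return the result it left."""
    argument_stack.extend(arguments)
    function()
    return argument_stack.pop()


def subtract():
    # Arguments come off in reverse order: the last one pushed is popped first.
    right = argument_stack.pop()
    left = argument_stack.pop()
    argument_stack.append(left - right)


print(call(subtract, 10, 3))
```

The function itself takes no parameters; it finds its inputs on the stack and leaves its result there, which is how the caller and the function agree on where the data lives.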

Named Variables

More advanced languages provide named variables. A common implementation of named variables uses two lists: one for variable names and the other for variable values. A variable's value is stored in the 'value' list at the same index as its name in the 'name' list.
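The two-list approach can be sketched in Python as follows. The class name `VariableStore` and its methods are assumptions made for this example. Python lists are indexed from 0, whereas Scratch lists are indexed from 1, but the idea is the same.

```python
class VariableStore:
    """Named variables kept in two parallel lists."""

    def __init__(self):
        self.names = []
        self.values = []

    def set(self, name, value):
        if name in self.names:
            # Existing variable: overwrite the value at the matching index.
            self.values[self.names.index(name)] = value
        else:
            # New variable: append to both lists so the indexes stay aligned.
            self.names.append(name)
            self.values.append(value)

    def get(self, name):
        return self.values[self.names.index(name)]


store = VariableStore()
store.set("x", 1)
store.set("y", 2)
store.set("x", 5)
print(store.get("x"), store.names, store.values)
```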

A more advanced implementation is one where the compiler replaces each variable's name with its location when the code is compiled. This way, the program does not need to spend time looking up a variable's location by name while it runs.
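A minimal Python sketch of this compile-time substitution is shown below. The `(operation, name)` instruction format, the operation names `load` and `store`, and the function `resolve` are hypothetical and used only to illustrate the idea.

```python
def resolve(instructions):
    """Replace each variable name with a fixed slot index, once, ahead of time."""
    slots = []      # slot index -> variable name
    resolved = []
    for operation, name in instructions:
        if name not in slots:
            slots.append(name)
        resolved.append((operation, slots.index(name)))
    return resolved, slots


source = [("load", "x"), ("store", "y"), ("load", "x")]
compiled, slot_names = resolve(source)
print(compiled)
print(slot_names)
```

After this step the running program can go straight to a slot in the value list by index, and the name search happens only once, during compilation.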


If memory is a large list, pointers are the indexes of elements of the list. To use pointers properly, the language should provide the following capabilities:

  • getting the value of a pointer
  • setting the value of a pointer
  • getting the value pointed to by a pointer
  • setting the value pointed to by a pointer
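The four capabilities above can be sketched in Python, treating memory as one list and a pointer as a cell that holds the index of another cell. The function names are assumptions made for this example, and indexes start at 0 here whereas Scratch lists start at 1.

```python
def load(memory, address):
    """Get the value of a pointer (the index it currently holds)."""
    return memory[address]


def store(memory, address, value):
    """Set the value of a pointer (make it hold a different index)."""
    memory[address] = value


def load_indirect(memory, pointer_address):
    """Get the value pointed to by the pointer stored at pointer_address."""
    return memory[memory[pointer_address]]


def store_indirect(memory, pointer_address, value):
    """Set the value pointed to by the pointer stored at pointer_address."""
    memory[memory[pointer_address]] = value


memory = [0] * 8
store(memory, 0, 5)            # cell 0 is a pointer to cell 5
store_indirect(memory, 0, 42)  # writes 42 into cell 5
print(load(memory, 0), load_indirect(memory, 0), memory)
```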

Data types

Most languages support the Scratch 'variable' as the only primitive data type. However, it is possible to support additional user-defined data types (such as a 'point' containing variables x and y), and/or 'complex' data types (such as a pointer to another data type).

Kinds of Interactions

Interactions are the methods by which a program interacts with the user/programmer. There are usually two types of interactions: Terminal/Console, and IDE/Driver.

Terminal Interaction: CUI environment

Usually, a programming language will be run by a makeshift console consisting of a list display and an ask textbox. Commands like -h and -r are used to get help and run, respectively. Here, the programming language often provides a print function which adds an item to the list, thus "printing" it. An example is QuickSilver.

IDE and Driver Interaction: GUI environment

Another way to run some programs is the IDE and Driver Interaction method. You can create, edit, test, and debug programs in an IDE which is designed to help you write better code. Then you can save your program, and others can install it on their own personal VM (driver software) by importing a project which contains the code. Often the VM acts like a simple OS, allowing multiple "Apps" to be added and run from a single database. IDE and Driver programming languages often do not require a console or terminal to be run; instead, they use vector graphics for their output via the Pen. An example of this is Skip.


