The V8 assembly appeared to me in dreams.

It appeared to me and said:

“I'm the owner of your stack.”

If there is one thing V8 knows how to do, it is generating efficient assembly that executes low-level instructions in the most optimized way possible.

But…

How does V8 generate its instructions?

The first thing would be to understand what I call the entry point.

The entry point is the place where a literal JavaScript expression is evaluated by V8.

This may change depending on the environment in which we are running V8.

• Node.js: Although Node.js has several entry points, the most used are:

o REPL mode.

o Reading a file.

• V8: In the d8 shell, which you get when you build V8, the entry points are the same as the Node.js ones above: the REPL or a file on disk.

• Blink: The entry point of JavaScript code is found in the V8 bindings section, and it’s nothing more than a v8::Handle type that can come from the web page itself, from an XHR request, etc.
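To make the idea of an entry point a bit more tangible, here is a minimal sketch using Node's built-in vm module. It only illustrates that both entry points end up handing a string of source to V8. The file name and the inlined "file" contents are made up for the example, and this is not how the Node.js REPL or module loader is actually implemented.

```javascript
const vm = require('vm');

// REPL-style entry: one line of source arrives as a string.
const replLine = '1 + 2';
const replResult = vm.runInNewContext(replLine);

// File-style entry: the whole file arrives as a string. It is inlined here
// instead of being read with fs.readFileSync, and the file name is made up.
const fileSource = 'function greet() { return "hi"; }\ngreet();';
const script = new vm.Script(fileSource, { filename: 'example.js' });
const fileResult = script.runInNewContext();

console.log(replResult, fileResult);
```

In both cases the value that comes back is the value of the last expression statement in the source.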

That being said, let's continue with how V8 generates its instructions, but first, a curiosity about the Node.js entry point.

Node.js Curiosity (or trolling)

Open the node shell:

$ node

Now copy and paste this function declaration:

function nogg(){ return "Aholic"; }

Once this is done, copy and paste this line exactly, and to finish press enter:

nogg)(

Continue your life as if nothing had happened.
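If you can't continue your life without knowing why, here is one plausible explanation, stated as an assumption and not as a description of your Node.js version: older Node.js REPLs first tried to evaluate each line wrapped in parentheses, so that input would be treated as an expression. The sketch below reproduces that wrapping with the vm module. The evalWrapped helper is hypothetical, and newer REPLs may simply report a syntax error.

```javascript
const vm = require('vm');

// A separate context plays the role of the REPL's global scope.
const context = vm.createContext({});
vm.runInContext('function nogg(){ return "Aholic"; }', context);

// Hypothetical helper: wrap the typed line in parentheses before evaluating,
// which is the assumption this example rests on.
function evalWrapped(line) {
  return vm.runInContext('(' + line + ')', context);
}

// 'nogg)(' becomes '(nogg)()', which is a perfectly valid call.
const result = evalWrapped('nogg)(');
console.log(result);
```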

We already know the entry point

And now what? … Well, this is the point where my colleague AST comes into play.

“You’re all I ever wanted…”

An abstract syntax tree (AST) is a tree representation of source code, ready to be compiled into native code.

Each node of the tree is stored independently of the tree as a whole, which allows fast allocation and deallocation of all its nodes. This means that, once a tree node has been created, it can be compiled and removed from the tree regardless of whether the rest of the source behind the AST has been fully evaluated or compiled.


“Low-level programming is good for the soul of the programmer”

John Carmack

Here are some examples of nodes that can exist in an AST. They give us a sense of how an AST is composed internally and what each node contains:

  • ExpressionStatement
  • IfStatement
  • ContinueStatement
  • ReturnStatement
  • DoWhileStatement
  • ForInStatement
  • DebuggerStatement

That said, we are going to investigate the AST generated by the following code:

function dev_name () {
     return "Carlos Hernández Gómez";
}
dev_name ();

The tree generated for that code would be the following. Take a look at the comment blocks:

FUNC  
. NAME ""
. INFERRED NAME ""
. DECLS /*** DeclarationContext, we will talk about it in another article, but basically it is a reference counter for a v8::Context ***/
. . FUNCTION "dev_name" = function dev_name
. EXPRESSION STATEMENT
. . ASSIGN
. . . VAR PROXY local[0] (mode = TEMPORARY) ".result"
. . . CALL
. . . . VAR PROXY (mode = VAR) "dev_name"
. RETURN
. . VAR PROXY local[0] (mode = TEMPORARY) ".result"
/* Code of the function, very simple and lightweight since the function just returns a string literal */
FUNC  
. NAME "dev_name"
. INFERRED NAME ""
. RETURN
. . LITERAL "Carlos Hernández Gómez"

/*** Once executed, the function returns: ***/
"Carlos Hernández Gómez"  

In the generated AST we can see a first FUNC node for the top-level code, which declares the function and calls it, and a second FUNC node that contains the AST for the body of the function. They are different nodes that hang from the same tree, both generated by V8.

Regarding the AST there are several interesting things to know. The first is that a JavaScript expression that has already been converted to an AST will not be converted again if we declare the same expression later, since V8 keeps a cache of statements to be faster.

If for some reason we wanted a deeper analysis of the AST generated by V8, we could create our own class that extends AstVisitor. That class is in charge of visiting the statements of a literal expression in search of variable declarations, statement declarations, expression declarations, etc.
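AstVisitor itself is a C++ class inside V8, but the visitor idea is easy to sketch in JavaScript. The tree below is a hand-written stand-in that loosely mirrors the dump shown earlier. It is not produced by V8, and the node shape, the CountingVisitor class and its method names are all assumptions made for the example.

```javascript
// Hand-written stand-in for the printed tree (not generated by V8).
const ast = {
  type: 'Func', name: '',
  children: [
    { type: 'FunctionDeclaration', name: 'dev_name', children: [
      { type: 'Func', name: 'dev_name', children: [
        { type: 'Return', children: [
          { type: 'Literal', value: 'Carlos Hernández Gómez', children: [] }
        ] }
      ] }
    ] },
    { type: 'ExpressionStatement', children: [
      { type: 'Call', children: [
        { type: 'VarProxy', name: 'dev_name', children: [] }
      ] }
    ] }
  ]
};

// Hypothetical visitor: counts every node type and collects literals.
class CountingVisitor {
  constructor() {
    this.counts = {};
    this.literals = [];
  }

  visit(node) {
    this.counts[node.type] = (this.counts[node.type] || 0) + 1;
    // Dispatch to a type-specific method when one exists.
    const handler = this['visit' + node.type];
    if (handler) handler.call(this, node);
    node.children.forEach((child) => this.visit(child));
  }

  visitLiteral(node) {
    this.literals.push(node.value);
  }
}

const visitor = new CountingVisitor();
visitor.visit(ast);
console.log(visitor.counts, visitor.literals);
```

Adding a new analysis is just a matter of adding another visitXxx method, which is the main appeal of the pattern.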

Well, we already know what happens when V8 starts generating the valid code to be executed.

What is the next step?

Generate instructions

V8 uses two different compilers: the base compiler, also called the full compiler, and the advanced compiler, commonly called Crankshaft.

Full compiler

This compiler is responsible for generating native code as quickly as possible. It doesn’t optimize the generated code, and it’s the entry point of any JavaScript code.

Special mention here: code evaluated in V8 doesn’t always have to be compiled. Usually the code of a function is not compiled until it’s used for the first time; only the corresponding AST is generated. In the case of jQuery (or any other library), V8 will not generate code for all of it, only for the functions that are actually used. This way V8 gets pages to load faster, and we know how important page load speed is in this web thing.
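The idea of compiling a function only when it is first used can be sketched in a few lines. This is a toy model and not V8's mechanism: new Function stands in for the compiler, and the sources registry, the call helper and the counter are invented for the example.

```javascript
// Hypothetical registry: function name -> source text, kept uncompiled.
const sources = {
  used: 'return 21 * 2;',
  neverCalled: 'return "this body is never compiled";'
};

const compiled = {};   // name -> compiled function
let compileCount = 0;  // how many times the "compiler" ran

function call(name) {
  if (!compiled[name]) {
    // Compile on first use only; new Function plays the compiler here.
    compiled[name] = new Function(sources[name]);
    compileCount += 1;
  }
  return compiled[name]();
}

const first = call('used');
const second = call('used');
console.log(first, second, compileCount, Object.keys(compiled));
```

The function that is never called is never compiled, which is the same trade the paragraph above describes for a library where only a few functions are used.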

How do you comfort a JavaScript bug? … You console it

Each architecture (x86, x64…) has its own full-codegen-arch.cc file, which builds the specific code for that architecture. So every time the V8 team wants to support a new architecture, it has to write the instructions of that architecture in a file of that type.

Crankshaft

It’s the cool friend. In your group of colleagues, Crankshaft is the coolest guy of all, the one who flirts with everyone and leaves you and your colleagues empty-handed.

Actually, Crankshaft is the marketing name of V8's optimizing compiler. Crankshaft builds on the code generated by the full compiler and is in charge of detecting “important” functions, that is, functions that should be optimized. This could be, for example, a function containing a loop with a high number of iterations: Crankshaft will detect this loop and optimize it at runtime to get the maximum possible performance from it.
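As an illustration, this is the kind of code that would be a natural candidate: a small function with a tight loop, called many times with the same argument types. Whether a given V8 version actually optimizes it depends on its own heuristics, so take it as a sketch of the shape of "hot" code and not as a guarantee. The function name and the iteration counts are made up.

```javascript
// A small function with a tight loop and stable types (numbers only).
function sumOfSquares(limit) {
  let total = 0;
  for (let i = 0; i < limit; i++) {
    total += i * i;
  }
  return total;
}

// Calling it repeatedly is what makes it "important" from a profiler's view.
let last = 0;
for (let run = 0; run < 1000; run++) {
  last = sumOfSquares(1000);
}

console.log(last);
```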

Starting from the code generated by the base compiler, Crankshaft acts with 3 components:

• Runtime profiler: It’s responsible for monitoring the system and identifying the code that needs to be optimized, e.g., code that is continuously executed in a certain time interval.

• Optimizing compiler: Recompiles and optimizes the code identified by the runtime profiler.

• Deoptimization support: Suppose the optimizer has been too optimistic when optimizing a piece of JavaScript code. The deoptimizer lets us fall back to the original code generated by the base compiler, so a failed optimization doesn't break the code.

Crankshaft’s work is not based only on these components, but they are the most important part of it.
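The way these three pieces cooperate can be imitated in plain JavaScript. The sketch below is a toy simulation and not V8 code: the threshold, the guard and every name in it are assumptions, and a thrown error stands in for a real deoptimization bailout.

```javascript
// Baseline version: works for any operands, like the base compiler's code.
function addGeneric(a, b) {
  return a + b;
}

const HOT_THRESHOLD = 3; // made-up number of calls before "optimizing"
let calls = 0;
let optimized = null;
const events = [];

// "Optimizing compiler": a version specialized for numbers, with a guard.
function makeNumberOnlyAdd() {
  return function addNumbers(a, b) {
    if (typeof a !== 'number' || typeof b !== 'number') {
      throw new Error('deopt'); // the optimistic assumption failed
    }
    return a + b;
  };
}

function add(a, b) {
  if (optimized) {
    try {
      return optimized(a, b);
    } catch (e) {
      // "Deoptimization": drop the specialized code, go back to baseline.
      optimized = null;
      calls = 0;
      events.push('deoptimized');
      return addGeneric(a, b);
    }
  }
  // "Runtime profiler": count calls and flag the function once it is hot.
  calls += 1;
  if (calls >= HOT_THRESHOLD) {
    optimized = makeNumberOnlyAdd();
    events.push('optimized');
  }
  return addGeneric(a, b);
}

const results = [add(1, 2), add(3, 4), add(5, 6), add(7, 8), add('a', 'b')];
console.log(results, events);
```

The last call passes strings, which breaks the numbers-only assumption, so the toy engine throws away the specialized version and still returns the right answer through the baseline path.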

There is another component called Hydrogen to which we will dedicate a whole chapter in the future… But only if you have been left wanting more bytecode.

You can lock us away, but you’ll never defeat The Cobra’s!! Oink oink oink.

