This is part 2 of the Darwinex API series (part 1 is here). We will parse Darwinex’s historical data, which is in the JSON format. Normally, we would use Zorro’s dataParseJSON() function for such a task, but the function’s present incarnation does not appear to support the eccentric format that we have downloaded. (Perhaps someone will fix this later?) So we will explore an alternative method where we parse using a commonly available JSON library.
The JSON Data
When you first download historical data from the Darwinex API, you will find that it looks like this. Since the data was intended to be immediately parsed by a computer, any and all spaces and return characters have been removed.
However, in order to parse the JSON data, you need to understand its structure. For our analysis, we can convert the JSON into a friendlier format, known as “pretty print”. It looks like this, and it’s much more readable. Here is a web tool that will pretty-print your JSON.
So this is the current structure of the JSON data:
- Candles array object
  - Candles element #0 only has a timestamp for some reason.
  - The remaining candles elements are comprised of two elements:
    - candle element, consisting of
      - close
      - max
      - min
      - open
      - timestamp (in unix epoch format)
    - candle element, consisting of
- Period object
- Resolution object
- Supported resolutions array object
Observing the above structure, we can deduce a simple value acquisition plan:
1. We cannot pick up the timestamp whose candle object is missing. So we start by finding the first “close” string value.
2. The immediately following value will then be the float value of the close.
3. Repeat steps 1 and 2 for “max”, “min”, “open”, and “timestamp”, in this exact order.
4. You now have a single entry, so record it.
5. Repeat steps 1 through 4 until you get to the end of the JSON file.
The beauty of this approach is its simplicity. However, it will break if Darwinex changes the API, for example by changing the order of the fields. If that ever happens, all you have to do is reorder the key lookups in your code to match, and it is fixed. There are, of course, more bulletproof methods, but those are beyond the scope of this post.
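To make the acquisition plan concrete, here is a minimal sketch in plain C. It is not the ZJSMN-based script from this post: it swaps the tokenizer for simple string scanning with strstr() and strtod(), purely to show the "find key, take the next value, in fixed order" idea. The names Candle, next_value(), and scan_candles() are made up for this illustration, and it assumes the values are unquoted JSON numbers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One parsed entry; field names follow the JSON keys (hypothetical struct). */
typedef struct {
    double close, max, min, open, timestamp;
} Candle;

/* Find the next occurrence of the quoted key at or after *cursor, read the
   number that follows its colon, and advance *cursor past that number.
   Returns 1 on success, 0 if the key or a number was not found. */
static int next_value(const char **cursor, const char *key, double *out)
{
    char quoted[32];
    const char *p;
    char *end;
    double v;

    snprintf(quoted, sizeof quoted, "\"%s\"", key);
    p = strstr(*cursor, quoted);
    if (!p) return 0;
    p = strchr(p + strlen(quoted), ':');
    if (!p) return 0;
    v = strtod(p + 1, &end);
    if (end == p + 1) return 0;   /* nothing numeric after the colon */
    *out = v;
    *cursor = end;
    return 1;
}

/* Steps 1 to 5 of the plan: look for close, max, min, open, timestamp in
   that exact order, record the entry, and repeat until the text runs out.
   Returns the number of entries written to out. */
static int scan_candles(const char *json, Candle *out, int max_out)
{
    const char *cursor = json;
    int n = 0;

    while (n < max_out) {
        Candle c;
        if (!next_value(&cursor, "close", &c.close)) break;
        if (!next_value(&cursor, "max", &c.max)) break;
        if (!next_value(&cursor, "min", &c.min)) break;
        if (!next_value(&cursor, "open", &c.open)) break;
        if (!next_value(&cursor, "timestamp", &c.timestamp)) break;
        out[n++] = c;
    }
    return n;
}
```

Because each pass begins by searching for “close”, the timestamp-only element #0 is skipped without any special handling. If the field order ever changes, the five next_value() calls are the only lines that need to be reordered.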
The JSON Parser
Consider not writing your own JSON parser if you do not have to. And in this case, we still do not have to. We will use a freely available C library called JSMN. With very minimal modifications, I was able to convert this library into a Zorro-friendly library. Let’s call this tweaked library ZJSMN.
To use ZJSMN, all you need is:
- zjsmn.c in Zorro’s “include” directory.
- #include <zjsmn.c> at the top of any script where it is needed.
- I suppose you should learn how this library works. There are articles about this. I will touch on this briefly.
Here is a Zorro script that uses ZJSMN to parse the Darwin historical JSON data. All of the files that you will need for this script can be found here. Simply add all of these files to your Zorro folder, in the structure shown.
Now that you have the code, let me walk you through what is happening:
- At the top of main(), we allocate enough memory and load our file.
- We also set up a Zorro dataSet handle. A Zorro dataSet will be used to accumulate historical data in the T6 format.
- Next, we prepare a parser object and an array of 100,000 tokens, which for our purposes is more than enough. A token is basically an element identifier, which cross-references to our JSON string. An element can be anything in the JSON structure such as an object, array, key, or value, so there will be a token for all of these things.
- We initialize the parser.
- Then we get all of our tokens. We also learn how many tokens were parsed; we will need this info for the read loop.
- We prepare a single T6 struct and three indices for the read loop:
  - One for the token number.
  - One to keep track of which key we are searching for (“close”, “max”, “min”, “open”, “timestamp”).
  - One just to count how many entries we added.
- We have a subroutine key_matches(), which checks for a matching key. If there is a match, it checks the next token for a value and outputs to valueOutput and returns true. Otherwise, it returns false.
- Once we have collected all values, we use Zorro’s dataAppendRow() function to acquire a new row for our dataSet. A T6 struct has seven fields, so we need to request seven fields.
- We fill the row with our entry, and we move on to the next token.
- Finally, we sort our dataSet and save to a binary t6 file.
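To illustrate how a token cross-references the JSON string, here is a rough sketch of what a key_matches() style helper could look like. This is not the code from the downloadable script. The Token struct below is a simplified stand-in for JSMN's jsmntok_t (which also carries a type and a child count), the signature is an assumption, and the tokens in the usage note are built by hand instead of coming from the parser.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a JSMN token: the byte range of one element
   inside the JSON text. For strings, the range excludes the quotes. */
typedef struct {
    int start;  /* offset of the first character */
    int end;    /* offset one past the last character */
} Token;

/* Hypothetical helper: if tokens[i] spells exactly `key`, convert the
   following token to a double, store it in *valueOutput and return 1.
   Returns 0 if the key does not match or there is no following token. */
static int key_matches(const char *json, const Token *tokens, int count,
                       int i, const char *key, double *valueOutput)
{
    int len;

    if (i < 0 || i + 1 >= count) return 0;
    len = tokens[i].end - tokens[i].start;
    if ((int)strlen(key) != len) return 0;
    if (strncmp(json + tokens[i].start, key, (size_t)len) != 0) return 0;

    /* strtod stops at the first non-numeric character (',' or '}'),
       so the value token does not need to be copied or terminated. */
    *valueOutput = strtod(json + tokens[i + 1].start, NULL);
    return 1;
}
```

For the text {"close":101.5,"max":102.0}, the key “close” occupies offsets 2 to 7 and its value 9 to 14, so with tokens {2,7}, {9,14}, {16,19}, {21,26} a call at index 0 with the key "close" would yield 101.5. The tokens hold no copies of the data, only positions in the original string.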
I should add a note about the T6 data. Since it is daily data, we round each timestamp to the beginning of the nearest day in UTC. Fortunately, this only involves rounding to the nearest integer, because Zorro's DATE format counts time in days, so whole numbers fall on midnight UTC.
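As a small sketch of that rounding, the conversion could look roughly like the code below. It assumes the JSON timestamps are in Unix seconds (if they turn out to be milliseconds, divide by 1000 first) and that Zorro's DATE follows the Windows OLE convention, in which 1970-01-01 00:00 UTC is day 25569. The function names are made up for this example.

```c
/* Day number of 1970-01-01 00:00 UTC in the OLE DATE convention. */
#define UNIX_EPOCH_AS_DATE 25569.0
#define SECONDS_PER_DAY    86400.0

/* Convert Unix seconds (assumed unit) to a DATE value: whole part is the
   day, fractional part is the time of day in UTC. */
static double unix_to_date(double unix_seconds)
{
    return unix_seconds / SECONDS_PER_DAY + UNIX_EPOCH_AS_DATE;
}

/* Snap a DATE to the nearest UTC midnight by rounding to the nearest
   integer. Valid for positive DATE values, i.e. any modern timestamp. */
static double round_to_day(double date)
{
    return (double)(long long)(date + 0.5);
}
```

For example, 1500076800 is midnight UTC on 2017-07-15, which converts to DATE 42931. A timestamp one hour later still rounds to 42931, while one taken at 23:00 the same day rounds up to 42932.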
And, of course, always free any memory you allocate.
Happy parsing!