
342 days later .... Magics of Kaitai Struct

A project log for Tsukuyomi

Hacking Lunii

dan ksz, 10/10/2019 at 13:12

Due to some personal circumstances and then the COVID-19 confinement, I couldn't find the time to work effectively on Tsukuyomi or to update the project logs.

Nevertheless, in the previous log I reached a good milestone in understanding the Lunii image format, which led me to the conclusion: "I have to write a Python script to parse the contents of Lunii images".

Then, thanks to a ham radio project, I discovered the Holy Grail of binary parsing: Kaitai Struct.

Before starting this log, it is worth noting that, thanks to a message in the Public Chat, I came across another project, https://github.com/marian-m12l/studio, which builds a complete and advanced application written in Java to update the Lunii over USB. Unfortunately I haven't managed to use it yet, but the project seems to contain valuable information about the Lunii image format, which I intend to draw on if I get stuck somewhere.

1. Falling in Love with Kaitai Struct

Kaitai Struct basically defines a declarative language for describing binary data structures in a .ksy file, which is then compiled into source files in your preferred programming language.

$ ksc --target python kaitai-struct/lunii.ksy

So after playing with Kaitai Struct in its powerful online IDE on some small portions of Lunii image dumps, and also locally with the Kaitai Struct Compiler (KSC) and Kaitai Struct Visualizer (KSV) on the whole Lunii image dump, I iteratively built a first description of the Lunii image format in the Kaitai Struct language, lunii.ksy. It consists of 3 principal structures: content_struct, story_struct and node_struct. It also imports the Kaitai Struct definitions of Bitmap and Wav files, which I copied from the Kaitai Struct Format Library repository, home to a wonderful number of format descriptions written in the Kaitai Struct language.

a. content_struct

In the Lunii image I found that the content from 0 up to offset 0x030D_4000 contains no useful information besides the version and some system media (Battery, Error, USB ...). So I decided to ignore it and just call it Header.

meta:
  id: lunii
  file-extension: lunii
  endian: be
  imports:
    - bmp
    - wav
seq:
   - id: header
     size: fileoffset
   - id: contents
     type: content_struct
instances:
   fileoffset:
     value: 0x30D4000

The useful things start at address 0x030D_4000: the definition of the content, in particular the number of stories and the stories' information (start_address and size).

  content_struct:
    seq:
      - id: nbr_stories
        type: u2
      - id: stories
        type: story_struct
        repeat: expr
        repeat-expr: nbr_stories
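To make the layout concrete, here is a minimal Python sketch of the same table read by hand with the standard struct module instead of the generated parser. It assumes only what the KSY states: a big-endian u2 count followed by that many entries of three u4 fields (detailed in story_struct below). The function name parse_content_table and the sample bytes are made up for illustration.

```python
import struct

def parse_content_table(data: bytes):
    """Parse nbr_stories, then that many (start, size, unknown) entries."""
    (nbr_stories,) = struct.unpack_from(">H", data, 0)
    stories = []
    for i in range(nbr_stories):
        # Each entry is three big-endian u4 fields, 12 bytes in total,
        # and the entries start right after the 2-byte count.
        start, size, unknown = struct.unpack_from(">III", data, 2 + 12 * i)
        stories.append((start, size, unknown))
    return stories

# Synthetic two-story table, built only to exercise the parser.
sample = (
    struct.pack(">H", 2)
    + struct.pack(">III", 0x00000001, 0x0015FEE4, 0)
    + struct.pack(">III", 0x0015FEE5, 0x0005BD3E, 0)
)
parsed = parse_content_table(sample)
```

On a real image, data would be the bytes read starting at fileoffset.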

b. story_struct

The story information contains the start address and size, both in sector units (512 bytes), followed by a value that I still have to figure out and have called unknown.

To make it easier to calculate the precise address of the story fields (needed further on), I had to store the absolute location of the start address in a special variable called abs_start_address.

  story_struct:
    seq:
      - id: start_address
        type: u4
      - id: size
        type: u4
      - id: unknown
        type: u4
    instances:
      abs_start_address:
        value: _root.fileoffset + (start_address * 0x200)
      story_info:
        pos: abs_start_address
        type: story_info_struct
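The abs_start_address expression is plain arithmetic, so it can be checked by hand. The sketch below redoes it in Python with two sector values taken from the parser output shown later in this log (0x00000001 and 0x0015FEE5); the constant and function names are my own.

```python
SECTOR_SIZE = 0x200      # 512 bytes per sector
FILE_OFFSET = 0x30D4000  # value of fileoffset in the KSY

def abs_start_address(start_sector: int) -> int:
    """Mirror of the KSY instance: fileoffset + start_address * 0x200."""
    return FILE_OFFSET + start_sector * SECTOR_SIZE

addr0 = abs_start_address(0x00000001)  # 0x030D4200, printed for Story[00]
addr1 = abs_start_address(0x0015FEE5)  # 0x2F0B0A00, printed for Story[01]
```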

So story_info_struct defines a story by a number of nodes, a boolean (which I believe refers to factory_disabled) and a uint16_t for the version of the story. After some zeros, the story nodes start.

A node is a story media storage unit that contains the data for the Bitmap and/or the Wav audio.

  story_info_struct:
    seq:
      - id: nbr_nodes
        type: u2
      - id: factory_disabled
        type: u1
      - id: version
        type: u2
      - id: padding
        size: 0x200 - 5 #TODO this must be reworked
      - id: nodes
        type: node_struct
        repeat: expr
        repeat-expr: nbr_nodes
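Read literally, this description gives a 5-byte header padded to one full sector, followed by one node record per sector (node_struct below adds up to 0x200 bytes). A minimal sketch of that layout, assuming nothing beyond the KSY; the helper names and the sample values (168 nodes, version 3) are invented for the example.

```python
import struct

SECTOR_SIZE = 0x200

def parse_story_info_header(sector: bytes):
    """Return (nbr_nodes, factory_disabled, version) from the first 5 bytes."""
    return struct.unpack_from(">HBH", sector, 0)

def node_offset(index: int) -> int:
    """Offset of node `index` relative to the story start.

    The header sector comes first, then one sector per node.
    """
    return SECTOR_SIZE * (1 + index)

# Synthetic header sector: the 5 meaningful bytes, then zero padding.
header = struct.pack(">HBH", 168, 0, 3).ljust(SECTOR_SIZE, b"\x00")
info = parse_story_info_header(header)
```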

c. node_struct

The node struct contains the start addresses and sizes of the media, in addition to a structure that I called navigation_struct, which I believe holds information about navigation and transitions from one node to another.

  node_struct:
    seq:
      - id: uuid
        size: 16
      - id: image_start_sector
        type: u4
      - id: image_size
        type: u4
      - id: audio_start_sector
        type: u4
      - id: audio_size
        type: u4
      - id: navigation
        type: navigation_struct
        size: 0x200 - 32
    instances:
##### this is just for debugging
      story_start_address:
        value: _parent._parent.abs_start_address
      image_start_address:
        value: story_start_address + ((1 + image_start_sector) * 0x200)
      audio_start_address:
        value: story_start_address + ((1 + audio_start_sector) * 0x200)
#####
      image:
        type: bmp
        pos: image_start_address
        size: image_size * 0x200
        if: image_size != 0xffffffff
      audio:
        type: wav
        pos: audio_start_address
        size: audio_size * 0x200
        if: audio_size != 0xffffffff
      image_raw:
        pos: image_start_address
        size: image_size * 0x200
        if: image_size != 0xffffffff
      audio_raw:
        pos: audio_start_address
        size: audio_size * 0x200
        if: audio_size != 0xffffffff 
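The address instances and the 0xffffffff guard can be reproduced in a few lines of Python. This is only a sketch of the arithmetic in the KSY. The sector values 206 and 451 are not read from the image: I back-computed them from the address and size that the parser prints for the first image later in this log. The names media_span and NO_MEDIA are my own.

```python
SECTOR_SIZE = 0x200
NO_MEDIA = 0xFFFFFFFF  # sentinel tested by the `if:` clauses in the KSY

def media_span(story_start: int, start_sector: int, size_sectors: int):
    """Return (address, size_in_bytes), or None when the node has no such media."""
    if size_sectors == NO_MEDIA:
        return None
    # The "+ 1" skips the story_info sector at the start of the story.
    address = story_start + (1 + start_sector) * SECTOR_SIZE
    return address, size_sectors * SECTOR_SIZE

story_start = 0x030D4200
image = media_span(story_start, 206, 451)               # (0x30EE000, 230912)
missing = media_span(story_start, NO_MEDIA, NO_MEDIA)   # None

# Without the guard, the sentinel yields a meaningless address:
unguarded = story_start + (1 + NO_MEDIA) * SECTOR_SIZE  # 0x200030D4200
```

The unguarded value is the address printed for the image of node[002] in the parser output below, where the object is None.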

The complete KSY file is stored in the GitHub repository of the project: https://github.com/danksz/tsukuyomi

d. Visualization 

As said before, KSV is a very handy tool for visualizing the format description applied to the binary image.

$ ksv Lunii.img lunii.ksy

The only inconvenience with KSV is that it takes a very long time to parse the Lunii image (8 GB) and also requires a lot of RAM, so killing Chrome/Firefox beforehand may help ;)

So after checking with KSV that the KSY definition works well for one or two nodes, and likewise for stories, I compiled it into a Python module: lunii.py

$ ksc --target python kaitai-struct/lunii.ksy

2. The Parser and The Explorer

After generating the lunii.py module, developing a Python parser for the Lunii image was very straightforward: sd-lunii-parser.py.

The parser displays the whole structure of the Lunii image and where the media (Bitmap images and Wav sounds) are stored.

$ ./sd-lunii-parser.py ../dump/Lunii.img 

position of content structure: 51200000
position of number of stories: 51200062
Number of stories: 5

Story[00]
    address:   0x030D4200
    size:      0x2BFDC800
    sector:    0x00000001
    nbr_nodes: 168
    node[000]
        audio
            addr:   0x03A9E800
            size:   187904
            object: <wav.Wav object at 0x7f53d86a9be0>
        image
            addr:   0x30EE000
            size:   230912
            object: <bmp.Bmp object at 0x7f53d86a9e48>

    node[001]
        audio
            addr:   0x03ACC600
            size:   255488
            object: <wav.Wav object at 0x7f53d86a9f60>
        image
            addr:   0x3126600
            size:   230912
            object: <bmp.Bmp object at 0x7f53d86b0208>

    node[002]
        audio
            addr:   0x03B0AC00
            size:   199680
            object: <wav.Wav object at 0x7f53d86b0320>
        image
            addr:   0x200030D4200
            size:   2199023255040
            object: None

[..]
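For readers who have not used a Kaitai-generated module, the following sketch shows the kind of traversal such a parser performs. It is not the actual sd-lunii-parser.py. It assumes that the generated Python object exposes the KSY ids as attributes (contents, stories, abs_start_address, story_info, nbr_nodes), and it uses a hand-built stand-in object so that it runs without an image.

```python
from types import SimpleNamespace

def summarize(image):
    """Return one summary line per story of a parsed Lunii image object."""
    lines = [f"Number of stories: {image.contents.nbr_stories}"]
    for index, story in enumerate(image.contents.stories):
        lines.append(
            f"Story[{index:02d}] address=0x{story.abs_start_address:08X} "
            f"nbr_nodes={story.story_info.nbr_nodes}"
        )
    return lines

# Stand-in for the parsed image. With the real module the object would
# presumably come from lunii.Lunii.from_file("Lunii.img") (assumed class name).
fake_story = SimpleNamespace(
    abs_start_address=0x030D4200,
    story_info=SimpleNamespace(nbr_nodes=168),
)
fake_image = SimpleNamespace(
    contents=SimpleNamespace(nbr_stories=1, stories=[fake_story])
)
summary = summarize(fake_image)
```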

The most exciting thing about Kaitai Struct is that it comes with a lot of other tools and libraries besides KSC, KSV and the Kaitai Struct Format Library. One example is KaitaiFS, which makes it possible to mount a filesystem through FUSE given only the generated Python module:

$ python3 -m kaitaifs.generic lunii Lunii.img mountpoint 

So after mounting the image I can use a file browser to explore the storage media, just like exploring a USB or SD card drive formatted with a FAT32 filesystem.

I've also successfully opened the audio files in a media player (e.g. VLC).

3. The Extractor and the Combiner

Just like the sd-lunii-parser.py script, creating additional Python scripts that use the generated lunii.py module is very easy.

The sd-lunii-extract-stories.py script parses a Lunii image and extracts all the stories it finds into a folder named out-extract:

Number of stories: 5

Story[00]
    address:   0x030D4200
    size:      0x2BFDC800
    sector:    0x00000001
    nbr_nodes: 168

Story[01]
    address:   0x2F0B0A00
    size:      0x0B7A7C00
    sector:    0x0015FEE5
    nbr_nodes: 019

Story[02]
    address:   0x3A858600
    size:      0x08445A00
    sector:    0x001BBC23
    nbr_nodes: 199

Story[03]
    address:   0x42C9E000
    size:      0x08BDFA00
    sector:    0x001FDE50
    nbr_nodes: 019

Story[04]
    address:   0x4B87DA00
    size:      0x14FBB200
    sector:    0x00243D4D
    nbr_nodes: 055
Extracting story n:0 to out-extract/Lunii-story0.lunii
Extracting story n:1 to out-extract/Lunii-story1.lunii
Extracting story n:2 to out-extract/Lunii-story2.lunii
Extracting story n:3 to out-extract/Lunii-story3.lunii
Extracting story n:4 to out-extract/Lunii-story4.lunii
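The log does not show what sd-lunii-extract-stories.py writes into each .lunii file, so the following is only a guess at the core operation: copying the byte range of one story out of the image. It assumes a story is the contiguous range [address, address + size), which matches the listing above, where each story's address plus size equals the next story's address. The function name and chunk size are my own choices.

```python
import io

CHUNK = 1 << 20  # copy 1 MiB at a time so a multi-GB image never sits in RAM

def extract_range(src, dst, address: int, size: int) -> int:
    """Copy `size` bytes starting at `address` from src to dst.

    Returns the number of bytes actually copied.
    """
    src.seek(address)
    remaining = size
    while remaining > 0:
        chunk = src.read(min(CHUNK, remaining))
        if not chunk:
            break  # source ended before `size` bytes were available
        dst.write(chunk)
        remaining -= len(chunk)
    return size - remaining

# In-memory stand-ins for the image file and the output file.
src = io.BytesIO(bytes(range(256)) * 8)  # 2048 bytes
dst = io.BytesIO()
copied = extract_range(src, dst, address=0x200, size=0x400)
```

With real files, src and dst would be opened in binary mode ("rb" and "wb").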

Unfortunately, Kaitai Struct currently supports only reading binary data (writing seems to be supported only in a development branch, and for Java only). But thanks to some references I had already put in the KSY file, I can at least add stories extracted from one Lunii image to another Lunii image.

So I only had to reconstruct the content structure manually, as in sd-lunii-concat-stories.py:

    # Number of stories: big-endian u2, as declared in content_struct
    nbr_stories_bin = nbr_stories.to_bytes(2, byteorder='big', signed=False)

    stories_struct_bin = b""

    for pos, size in added_stories:
        # Convert absolute byte values to sector units relative to fileoffset
        pos -= fileoffset
        pos //= 0x200
        size //= 0x200
        unknown = 0  # meaning not understood yet, written as zero
        pos_bin = pos.to_bytes(4, byteorder='big', signed=False)
        size_bin = size.to_bytes(4, byteorder='big', signed=False)
        unknown_bin = unknown.to_bytes(4, byteorder='big', signed=False)
        stories_struct_bin += pos_bin + size_bin + unknown_bin

    # Overwrite the content structure at fileoffset
    content_bin = nbr_stories_bin + stories_struct_bin
    bfile.seek(fileoffset)
    bfile.write(content_bin)
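The same table can be built in a self-contained way with the struct module, which makes it easy to test without an image. This is a sketch under the same assumptions as the excerpt above (big-endian fields, sector units, unknown written as 0). The function name build_content_table is hypothetical, and the two input stories reuse addresses and sizes from the listings in this log.

```python
import struct

SECTOR_SIZE = 0x200
FILE_OFFSET = 0x30D4000

def build_content_table(stories) -> bytes:
    """Build the content structure from (absolute_address, size_in_bytes) pairs."""
    stories = list(stories)
    table = struct.pack(">H", len(stories))
    for address, size in stories:
        sector = (address - FILE_OFFSET) // SECTOR_SIZE
        # The third field is the unknown value, left at 0 as in the excerpt.
        table += struct.pack(">III", sector, size // SECTOR_SIZE, 0)
    return table

table = build_content_table([
    (0x030D4200, 0x2BFDC800),
    (0x2F0B0A00, 0x0B7A7C00),
])
```

The table occupies 2 + 12 * n bytes, that is 62 bytes for 5 stories and 146 bytes for 12. This is consistent with the two positions ending in 062 and 146 that the parser prints for the original and the modified image.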

Thus sd-lunii-concat-stories.py can generate a new Lunii image that contains both the original stories and the previously extracted ones:

$ ./sd-lunii-concat-stories.py ../../dumps/Adam_Lunii.img out-extract2
Number of stories: 5

Story[00]
    address:   0x030D4200
    size:      0x2BFDC800
    sector:    0x00000001
    nbr_nodes: 168

Story[01]
    address:   0x2F0B0A00
    size:      0x0B7A7C00
    sector:    0x0015FEE5
    nbr_nodes: 019

Story[02]
    address:   0x3A858600
    size:      0x08445A00
    sector:    0x001BBC23
    nbr_nodes: 199

Story[03]
    address:   0x42C9E000
    size:      0x08BDFA00
    sector:    0x001FDE50
    nbr_nodes: 019

Story[04]
    address:   0x4B87DA00
    size:      0x14FBB200
    sector:    0x00243D4D
    nbr_nodes: 055
Copy base image to out/Adam_Lunii.img-mod.img
Adding Alice_Lunii-story1.lunii at 0x60838C00
Adding Alice_Lunii-story3.lunii at 0x6BFE0800
Adding Alice_Lunii-story4.lunii at 0x74BC0200
Adding Fred_Lunii-story1.lunii at 0x89B7B400
Adding Fred_Lunii-story2.lunii at 0x94DA4C00
Adding Fred_Lunii-story3.lunii at 0x9CDF4800
Adding Fred_Lunii-story4.lunii at 0xB3EBCC00
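The "Adding ... at" offsets above are consistent with each story being appended directly after the previous one, starting at the end of Story[04]. The sketch below checks this for the first added story, using the Story[04] address and size from the listing and the Story[05] size that the parser reports further down. It illustrates the arithmetic only and is not taken from the script; the function name is made up.

```python
def append_offsets(first_free: int, sizes):
    """Return (offsets, next_free) when stories are written back to back."""
    offsets = []
    cursor = first_free
    for size in sizes:
        offsets.append(cursor)
        cursor += size
    return offsets, cursor

# End of the base image's last story: Story[04] address + size.
end_of_base = 0x4B87DA00 + 0x14FBB200

# Size of the first added story, as printed for Story[05] by the parser.
offsets, next_free = append_offsets(end_of_base, [0x0B7A7C00])
# offsets[0] is 0x60838C00 and next_free is 0x6BFE0800,
# the first two "Adding ... at" addresses above.
```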

Let's now run the parser script to check whether sd-lunii-concat-stories.py generated a correct Lunii image:

$ ./sd-lunii-parser.py out/Adam_Lunii.img-mod.img
position of content structure: 51200000
position of number of stories: 51200146
Number of stories: 12            
                                                      
Story[00]                     
    address:   0x030D4200 
    size:      0x2BFDC800                             
    sector:    0x00000001     
    nbr_nodes: 168           
    node[000]                                         
        audio                                         
            addr:   0x03A9E800   
            size:   187904       
            object: <wav.Wav object at 0x7fbc803f5dd8>
        image                 
            addr:   0x30EE000
            size:   230912                            
            object: <bmp.Bmp object at 0x7fbc803fc080>
                             
    node[001]                                         
        audio                                         
            addr:   0x03ACC600   
            size:   255488       
            object: <wav.Wav object at 0x7fbc803fc198>
        image                 
            addr:   0x3126600
            size:   230912                            
            object: <bmp.Bmp object at 0x7fbc803fc400> 
[..]
Story[05]
    address:   0x60838C00
    size:      0x0B7A7C00
    sector:    0x002EBB26
    nbr_nodes: 019
    node[000]
        audio
            addr:   0x60A38400
            size:   133120
            object: <wav.Wav object at 0x7fcb2b967f60>
        image
            addr:   0x6083CE00
            size:   230912
            object: <bmp.Bmp object at 0x7fcadeae71d0>

    node[001]
        audio
            addr:   0x60A58C00
            size:   202240
[..]

Coooool!! The number of stories was indeed increased from 5 to 12, and the new stories appear!!!

And the most important check: after writing the new Lunii image to the SD card of the Lunii, all stories (old and new ones) were loaded SUCCESSFULLY!!!!

Next Steps

Try to understand the Navigation/Transition format to build a new story.
