Free Lossless Image Format, or FLIF, is the precursor to FUIF and JPEG XL. It claims a higher compression ratio, and hence smaller file sizes, than conventional lossless image formats such as PNG and lossless JPEG 2000. FLIF and FUIF development has been stopped in favour of JPEG XL.
The task was to write an encoder and decoder for this format for FFmpeg.
This project was undertaken by two people, which is apparently unusual for GSoC: both of us had submitted independent proposals for the same project. The public repository used for communication during development is available here.
$ > git diff --stat HEAD~4
Changelog | 3 +-
configure | 1 +
doc/general.texi | 2 +
libavcodec/Makefile | 3 +
libavcodec/allcodecs.c | 1 +
libavcodec/codec_desc.c | 7 +
libavcodec/codec_id.h | 1 +
libavcodec/flif16.c | 204 +++
libavcodec/flif16.h | 287 ++++
libavcodec/flif16_parser.c | 193 +++
libavcodec/flif16_rangecoder.c | 800 +++++++++++
libavcodec/flif16_rangecoder.h | 400 ++++++
libavcodec/flif16_transform.c | 2895 ++++++++++++++++++++++++++++++++++++++++
libavcodec/flif16_transform.h | 124 ++
libavcodec/flif16dec.c | 1764 ++++++++++++++++++++++++
libavcodec/parsers.c | 1 +
libavcodec/version.h | 2 +-
libavformat/Makefile | 1 +
libavformat/allformats.c | 1 +
libavformat/flifdec.c | 431 ++++++
libavformat/version.h | 2 +-
21 files changed, 7120 insertions(+), 3 deletions(-)
This is a list of links to the iterations of the source code posted to the FFmpeg development mailing list. The latest iteration is version 6. (Faulty patches are not shown.)
v3:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267742.html
v5:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267920.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267921.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267922.html
v6:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268473.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268474.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268475.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268476.html
Developing for FFmpeg requires the interested party to read through a lot of the existing code base and to refer to existing components of the same type as the one being developed, which I think is common and necessary in any software project.
However, I think there could be a few more pointers on what a novice developer has to do, such as being told to expect to read a lot of source code, and which files to refer to when developing a particular kind of component.
In the initial stages of the project, I was in disarray about what to do and where to start. Understanding what a component even does took up a large part of the time spent.
For example, the description of libavformat in FFmpeg's documentation says that it deals with format demuxers, muxers and I/O. This means that, say, for a video file, the audio and video streams are separated (demuxed) and passed on to decoders, or joined together (muxed) after encoding. This is the primary purpose of libavformat, but its usage of the term (de)muxer is broader than that. A demuxer in libavformat is also responsible for probing the format, reading the format's header and deciding what is sent to the decoder as a packet. All of this could have been inferred from the phrase 'I/O' in the description, but I initially did not make the connection. It all makes sense once one draws a clear distinction between container formats (e.g. all video files with sound, such as WebM) and encoding formats (the format of an individual audio or video stream), which I had an intuitive sense of but could not articulate. A lot of the confusion came from the meaning of the term demuxer, and whether it is used only for container formats or for encoding formats as well.
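To make those three responsibilities concrete, here is a minimal sketch of the shape of a libavformat demuxer. It is illustrative only: the function names and the fixed packet size are placeholders, not our actual flifdec.c (AV_CODEC_ID_FLIF16 is the codec ID our patches add).

/* Sketch, as it would sit inside the FFmpeg tree (libavformat/). */
#include <string.h>
#include "libavutil/internal.h"
#include "avformat.h"

static int flif_probe(const AVProbeData *p)
{
    /* Probing: recognise the stream by its 4-byte "FLIF" magic. */
    if (p->buf_size >= 4 && !memcmp(p->buf, "FLIF", 4))
        return AVPROBE_SCORE_MAX;
    return 0;
}

static int flif_read_header(AVFormatContext *s)
{
    /* Header reading: set up the single video stream the decoder sees. */
    AVStream *st = avformat_new_stream(s, NULL);
    if (!st)
        return AVERROR(ENOMEM);
    st->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
    st->codecpar->codec_id   = AV_CODEC_ID_FLIF16; /* added by our patches */
    return 0;
}

static int flif_read_packet(AVFormatContext *s, AVPacket *pkt)
{
    /* Packetising: hand the undelimited bitstream to the decoder in
     * fixed-size chunks; the decoder itself finds the frame boundaries. */
    return av_get_packet(s->pb, pkt, 4096);
}

AVInputFormat ff_flif_demuxer = {
    .name        = "flif",
    .long_name   = NULL_IF_CONFIG_SMALL("Free Lossless Image Format"),
    .read_probe  = flif_probe,
    .read_header = flif_read_header,
    .read_packet = flif_read_packet,
};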
This brings us to the next section:
Kartik has made his own comments here. Please check them out.
Maybe a lot of the above problems could have been avoided, and our time saved, if we had gone through the FFmpeg manpage thoroughly, as well as the rest of the (non-API) documentation.
My (Anamitra's) experience with FFmpeg before the project was limited to clipping videos, extracting individual frames from GIFs, converting between image formats and reducing the file sizes of videos, all of which I assume are among the most common use cases of FFmpeg. I therefore had little more than a superficial understanding of FFmpeg.
However, after working on the project for half a month to a month, a lot of the on-site documentation now seems obvious to me. This came from going through the source code of the relevant files again and again, and from writing a working template of the decoder by hand. The FFmpeg codec HOWTO on MultimediaWiki does give useful pointers, but it could be updated with a better description of how to write codecs for still-image and GIF-like formats, which was the main hurdle to understanding in the context of our project, and, more generally, with a better anatomy of FFmpeg itself (which the manpages do provide, though not very directly and not in terms of the internal API). I have started to write a document about this for my own reference.
In conclusion, we think there is a need for slightly better developer documentation, with pointers for new developers and a description of what to expect when developing for the code base. However, I think the hurdles I encountered (and still encounter) while developing for FFmpeg are ones any developer must pass through in order to learn two important things:
Talking with others (other than ourselves) never really took center stage during the project (the size and nature of the project had a hand in this, and it is also why we could not send out patches frequently). Most of what we were able to do came from reading through the source code of FLIF and FFmpeg.
Kartik has made his own comments here. Please check them out.
One of the main decisions made while writing the decoder was to structure it as a state machine, so that it can save its state between intermittent packets. This was done because FLIF provides no simple way of determining the end of the bitstream, since the entirety of the image data is entropy coded. Additionally, the frame data is interleaved, which makes it impossible to split the data into frames without reading the whole of the image data first.
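The following self-contained toy illustrates the pattern (it is not our flif16dec.c, which is far more involved; the states and the one-byte "header" are invented for the example). The context records the current state and the position within it, so parsing can stop when a chunk runs out of bytes and resume when the next one arrives.

#include <stdio.h>
#include <stddef.h>

enum State { ST_MAGIC, ST_HEADER, ST_DATA, ST_DONE };

typedef struct {
    enum State state;  /* which part of the stream we are in */
    size_t pos;        /* progress within the current state  */
    size_t data_left;  /* payload bytes still expected       */
} Ctx;

/* Consume up to len bytes; safe to call repeatedly with successive
 * chunks. Returns the number of bytes consumed. */
static size_t feed(Ctx *c, const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len && c->state != ST_DONE) {
        switch (c->state) {
        case ST_MAGIC:  /* expect the 4-byte magic "FLIF" */
            if (buf[i++] != (unsigned char)"FLIF"[c->pos++]) {
                c->state = ST_DONE;        /* bad magic: give up */
            } else if (c->pos == 4) {
                c->state = ST_HEADER;
                c->pos   = 0;
            }
            break;
        case ST_HEADER: /* toy 1-byte header: payload length */
            c->data_left = buf[i++];
            c->state     = c->data_left ? ST_DATA : ST_DONE;
            break;
        case ST_DATA:   /* consume (here: print) the payload */
            putchar(buf[i++]);
            if (--c->data_left == 0)
                c->state = ST_DONE;
            break;
        default:
            break;
        }
    }
    return i;
}

int main(void)
{
    /* The "file" arrives split across two chunks, as packets would. */
    const unsigned char chunk1[] = { 'F', 'L', 'I', 'F', 5, 'h', 'e' };
    const unsigned char chunk2[] = { 'l', 'l', 'o' };
    Ctx c = { ST_MAGIC, 0, 0 };

    feed(&c, chunk1, sizeof chunk1); /* stops mid-payload, state saved */
    feed(&c, chunk2, sizeof chunk2); /* resumes where it left off      */
    putchar('\n');
    return c.state == ST_DONE ? 0 : 1;
}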
The reference decoder allocates new frame data for duplicate frames and copies the previous frame's data into them. We could not find a single reason to do this, so, at least in this respect, our decoder is more memory efficient.
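Since FFmpeg's AVFrames are reference counted, a duplicate frame can instead be emitted by taking a new reference to the previous frame's buffers. A sketch of the idea (the function name and the assumption that prev holds the last decoded frame are illustrative):

#include <libavutil/frame.h>

/* Emitting a duplicate frame without copying: take a new reference to
 * the previous frame's reference-counted data buffers. */
static int output_duplicate_frame(AVFrame *out, const AVFrame *prev)
{
    return av_frame_ref(out, prev); /* O(1): shares buffers, no pixel copy */
}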
The reference decoder crashes on certain files while trying to decode eXmp metadata. This does not happen with our decoder.
We may eventually rewrite the FLIF specification so that its components are much clearer.