# TODO list

## Release v0.6

1. Review encoder and check for lzma improvements under xz.
2. Fix binary tree matcher.
3. Compare compression ratio with xz tool using comparable parameters
   and optimize parameters.
4. Do some optimizations:
   - rename operation action and make it a simple type of size 8
   - make maxMatches, wordSize parameters
   - stop searching after a certain length is found (parameter sweetLen)

## Release v0.7

1. Optimize code.
2. Do statistical analysis to get linear presets.
3. Test sync.Pool compatibility for the xz and lzma Writer and Reader
   types; see the sketch after this list.
4. Fuzz the optimized code.
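
A minimal sketch of the pooling pattern to test for item 3. It pools the
output buffers rather than the Writer itself, and the helper `compress`
is illustrative, not part of the package API:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"

	"github.com/ulikunitz/xz"
)

// bufPool reuses output buffers between compressions; whether the
// xz.Writer itself can be pooled safely is what still needs testing.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func compress(data []byte) ([]byte, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	w, err := xz.NewWriter(buf)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	// Copy the result out, because buf goes back into the pool.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out, nil
}

func main() {
	out, err := compress([]byte("hello xz"))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out), "compressed bytes")
}
```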

## Release v0.8

1. Support parallel goroutines for writing and reading xz files.
2. Support a ReaderAt interface for xz files with small block sizes.
3. Improve compatibility between gxz and xz.
4. Provide a manual page for gxz.

## Release v0.9

1. Improve documentation.
2. Fuzz again.

## Release v1.0

1. Fully functioning gxz.
2. Add godoc URL to README.md (godoc.org).
3. Resolve all issues.
4. Define release candidates.
5. Public announcement.

## Package lzma

### Release v0.6

- Rewrite the Encoder into a simple greedy one-op-at-a-time encoder,
  including
  + a simple scan at the dictionary head for the same byte
  + use of the killer byte (requiring matches to get longer, the first
    test should be the byte that would make the match longer); see the
    sketch below
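
A rough sketch of the killer-byte test; the function and parameter names
are illustrative and not the package's internal API:

```go
// findLongest scans candidate positions in data for the longest match
// of pat. Before comparing a whole candidate prefix, it tests the byte
// that would make the match longer than the current best ("killer
// byte"); if that byte differs, the candidate cannot win and is
// skipped.
func findLongest(data, pat []byte, candidates []int) (pos, length int) {
	bestPos, best := -1, 0
	for _, c := range candidates {
		if best == len(pat) {
			break // cannot get any longer
		}
		if best > 0 && (c+best >= len(data) || data[c+best] != pat[best]) {
			continue // killer byte test failed
		}
		n := 0
		for n < len(pat) && c+n < len(data) && data[c+n] == pat[n] {
			n++
		}
		if n > best {
			bestPos, best = c, n
		}
	}
	return bestPos, best
}
```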

## Optimizations

- There may be a lot of false sharing in lzma.State; check whether this
  can be improved by reorganizing its internal structure (see the sketch
  after this list).
- Check whether batching encoding and decoding improves speed.
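
A generic illustration of the padding technique such a reorganization
could use; the fields are hypothetical and do not reflect the real
layout of lzma.State:

```go
// hotCounters is illustrative only. Padding keeps independently
// updated hot fields on separate cache lines (64 bytes on amd64),
// which is what avoiding false sharing amounts to.
type hotCounters struct {
	written uint64
	_       [56]byte // pad to the end of the cache line
	read    uint64
	_       [56]byte
}
```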

### DAG optimizations

- Use the full buffer to create minimal bit-length above the range
  encoder; see the sketch below.
- Might be too slow (see v0.4).
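
A compact sketch of that idea under stated assumptions: positions form a
DAG, the hypothetical ops function enumerates candidate operations with
estimated bit costs, and a forward dynamic-programming pass yields the
minimal total bit length:

```go
// minimalBits computes the minimal number of bits needed to encode the
// first n bytes. cost[i] is the best known bit length for the first i
// bytes; each candidate operation at position i adds an edge to i+len.
func minimalBits(n int, ops func(i int) []struct{ len, bits int }) int {
	const inf = int(^uint(0) >> 1)
	cost := make([]int, n+1)
	for i := 1; i <= n; i++ {
		cost[i] = inf
	}
	for i := 0; i < n; i++ {
		if cost[i] == inf {
			continue // position not reachable
		}
		for _, op := range ops(i) {
			if j := i + op.len; j <= n && cost[i]+op.bits < cost[j] {
				cost[j] = cost[i] + op.bits
			}
		}
	}
	return cost[n]
}
```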

### Different match finders

- hashes with 2 and 3 characters in addition to 4 characters; see the
  sketch below
- binary trees with 2-7 characters (uint64 as key, use uint32 as
  pointers into an array)
- rb-trees with 2-7 characters (uint64 as key, use uint32 as pointers
  into an array with bit-stealing for the colors)
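
A minimal sketch of the 4-character hash variant; the hash constant,
table size, and type names are arbitrary illustrations, not the values
used in the package:

```go
const hashBits = 16

// hash4 maps the 4 bytes at a position to a bucket of earlier
// positions, which become match candidates.
type hash4 struct {
	table [1 << hashBits][]uint32 // positions, newest appended last
}

// key4 hashes the first 4 bytes of b with a multiplicative hash.
func key4(b []byte) uint32 {
	x := uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 |
		uint32(b[3])<<24
	return (x * 2654435761) >> (32 - hashBits)
}

func (h *hash4) insert(data []byte, pos int) {
	if pos+4 <= len(data) {
		k := key4(data[pos:])
		h.table[k] = append(h.table[k], uint32(pos))
	}
}

func (h *hash4) candidates(data []byte, pos int) []uint32 {
	if pos+4 > len(data) {
		return nil
	}
	return h.table[key4(data[pos:])]
}
```

A real match finder would additionally bound the number of candidates
(maxMatches) and stop once a match of sweetLen is found, as listed under
Release v0.6.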

## Release Procedure

- Execute goch -l for all packages, probably with a lower parameter such
  as 0.5.
- Check orthography with gospell.
- Write release notes in doc/relnotes.
- Update README.md.
- Run xb copyright . in the xz directory to ensure all new files have a
  copyright header.
- Run VERSION=<version> go generate github.com/ulikunitz/xz/... to
  update the version files.
- Execute the tests for Linux/amd64, Linux/x86 and Windows/amd64.
- Update TODO.md and write a short log entry.
- git checkout master && git merge dev
- git tag -a <version>
- git push

## Log

### 2019-02-20

Release v0.5.6 supports the go.mod file.

### 2018-10-28

Release v0.5.5 fixes issue #19, where ErrLimit outputs were observed.

### 2017-06-05

Release v0.5.4 fixes issue #15, another problem with the padding size
check for the xz block header. I removed the check completely.

### 2017-02-15

Release v0.5.3 fixes issue #12 regarding the decompression of an empty
XZ stream. Many thanks to Tomasz Kłak, who reported the issue.

### 2016-12-02

Release v0.5.2 became necessary to allow the decoding of xz files with
4-byte padding in the block header. Many thanks to Greg, who reported
the issue.

### 2016-07-23

Release v0.5.1 became necessary to fix problems with 32-bit platforms.
Many thanks to Bruno Brigas, who reported the issue.

### 2016-07-04

Release v0.5 provides improvements to the compressor and adds support
for the decompression of xz files with multiple xz streams.

### 2016-01-31

Another compression-ratio increase by checking the byte at the length of
the best match first, before checking the whole prefix. This makes the
compressor even faster. We now have a large time budget to beat the
compression ratio of the xz tool. For enwik8 we now have over 40 seconds
to reduce the compressed file size by another 7 MiB.

### 2016-01-30

I simplified the encoder. Speed and compression ratio increased
dramatically. A high compression ratio also affects the decompression
speed. The approach with the buffer and optimizing for operation
compression rate has not been successful. Going for the maximum length
appears to be the best approach.

### 2016-01-28

The release v0.4 is ready. It provides a working xz implementation,
which is rather slow, but works and is interoperable with the xz tool.
It is an important milestone.

### 2016-01-10

I have the first working implementation of an xz reader and writer. I'm
happy about reaching this milestone.

### 2015-12-02

I'm now ready to implement xz because I have a working LZMA2
implementation. I decided today that v0.4 will use the slow encoder with
the operations buffer, to be able to go back if I intend to do so.

### 2015-10-21

I have restarted the work on the library. While trying to implement
LZMA2, I discovered that I need to simplify the encoder and decoder
functions again. The option approach is too complicated. Using a limited
byte writer, not caring about written bytes at all, and not trying to
handle uncompressed data simplifies the LZMA encoder and decoder a lot.
Processing uncompressed data and handling limits is a feature of the
LZMA2 format, not of LZMA.

I learned an interesting method from the LZO format: if the last copy is
too far away, the head is moved by 2 bytes instead of 1 byte to reduce
processing time.

### 2015-08-26

I have now reimplemented the lzma package. The code is reasonably fast,
but can still be optimized. The next step is to implement LZMA2 and then
xz.

### 2015-07-05

Created release v0.3. This version is the foundation for a full xz
implementation, which is the target of v0.4.

### 2015-06-11

The gflag package has been developed because I couldn't use flag and
pflag for fully compatible support of gzip's and lzma's options. It
seems to work quite nicely now.

### 2015-06-05

The overflow issue was interesting to research. However, Henry S. Warren
Jr.'s book Hacker's Delight was very helpful as usual and explained the
issue perfectly. Fefe's information on his website was based on the C
FAQ and quite bad, because it didn't address the issue that -MININT ==
MININT.
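
The wrap-around is easy to demonstrate in Go, where signed integer
overflow wraps in two's complement:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	x := int32(math.MinInt32)
	// Negating the minimum value overflows and wraps back to the
	// minimum value itself, so -MININT == MININT.
	fmt.Println(-x == x) // true
}
```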

### 2015-06-04

It has been a productive day. I improved the interface of lzma.Reader
and lzma.Writer and fixed the error handling.

### 2015-06-01

By computing the bit length of the LZMA operations I was able to improve
the greedy algorithm implementation. Using an 8 MByte buffer, the
compression ratio was not as good as for xz, but already better than the
gzip default.

Compression is currently slow, but this is something we will be able to
improve over time.

### 2015-05-26

Checked the license of ogier/pflag. The lzmago binary should include the
license terms for the pflag library.

I added the endorsement clause, as used by Google for the Go sources, to
the LICENSE file.

### 2015-05-22

The package lzb now contains the basic implementation for creating or
reading LZMA byte streams. It allows support for the implementation of
the DAG shortest-path algorithm for the compression function.

### 2015-04-23

Yesterday I completed the lzbase classes. I'm a little bit concerned
that using the components may require too much code, but on the other
hand there is a lot of flexibility.

### 2015-04-22

Implemented Reader and Writer during the Bayern game against Porto. The
second half gave me enough time.

### 2015-04-21

While showering this morning I discovered that the design for OpEncoder
and OpDecoder doesn't work, because encoding/decoding might depend on
the current status of the dictionary. This is not exactly the right way
to start the day.

Therefore we need to keep the Reader and Writer design. This time around
we simplify it by ignoring size limits. These can be added by wrappers
around the Reader and Writer interfaces. The Parameters type isn't
needed anymore.

However, I will implement ReaderState and WriterState types and use
static typing to ensure that the right State object is combined with the
right lzbase.Reader and lzbase.Writer.

As a start I have implemented ReaderState and WriterState to ensure that
ReaderState is only used by readers and WriterState only by writers.

### 2015-04-20

Today I implemented the OpDecoder and tested OpEncoder and OpDecoder.

### 2015-04-08

Came up with a new simplified design for lzbase. I have already
implemented the type State, which replaces OpCodec.

### 2015-04-06

The new lzma package is now fully usable and lzmago is using it now. The
old lzma package has been completely removed.

### 2015-04-05

Implemented lzma.Reader and tested it.

### 2015-04-04

Implemented baseReader by adapting code from lzma.Reader.

### 2015-04-03

opCodec was copied to lzma2 yesterday. opCodec has a high number of
dependencies on other files in lzma2. Therefore I had to copy almost all
files from lzma.

### 2015-03-31

Removed only a TODO item.

However, Francesco Campoy's presentation "Go for Javaneros
(Javaïstes?)" contains the idea that, given a type T with an embedded
field E, all the methods of E are defined on T, and if E is an
interface, T satisfies E.

https://talks.golang.org/2014/go4java.slide#51

I have never used this, but it seems to be a cool idea.
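
A small example of the pattern from the slide, with made-up names:

```go
package main

import "fmt"

// E is an interface; T embeds it, so E's method set is promoted to T
// and T itself satisfies E.
type E interface{ Describe() string }

type base struct{}

func (base) Describe() string { return "base" }

type T struct {
	E // embedded interface field
}

func main() {
	var e E = T{E: base{}}
	fmt.Println(e.Describe()) // prints "base"
}
```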

### 2015-03-30

Finished the type writerDict and wrote a simple test.

### 2015-03-25

I started to implement the writerDict.

### 2015-03-24

After thinking long about the LZMA2 code and several false starts, I now
have a plan to create a self-sufficient lzma2 package that supports the
classic LZMA format as well as LZMA2. The core idea is to provide
baseReader and baseWriter types that support the basic LZMA stream
without any headers. Both types must support the reuse of dictionaries
and the opCodec.

### 2015-01-10

1. Implemented a simple lzmago tool.
2. Tested the tool against a large 4.4 GB file.
   - compression worked correctly; tested decompression with lzma
   - decompression hits a full-buffer condition
3. Fixed a bug in the compressor and wrote a test for it.
4. Executed a full cycle for the 4.4 GB file; performance can be
   improved ;-)

### 2015-01-11

- Release v0.2 because of the working LZMA encoder and decoder