For 4.0.0, the trained data is forked from TessData Github, and gzip by following command:
$ sh gzip-traineddata.sh 4.0.0
Most of these were downloaded and extracted from Tesseract’s Google Code page. This repository also includes John Lin’s meme-ocr traineddata. It also includes the latest (2014-05-01) release of the Ancient Greek traineddata.