| 123456789101112131415161718192021222324252627282930313233343536373839404142434445 | 
							
- #------------------------------------------------------------------------------
 
- # $File: statistics,v 1.2 2020/10/08 17:51:53 christos Exp $
 
- # statistics:  file(1) magic for statistics related software
 
- #
 
- # From Remy Rampin
 
- # Stata is a statistical software tool that was created in 1985. While I
 
- # don't personally use it, data files in its native (proprietary) format
 
- # are common (.dta files).
 
- # 
 
- # Because they are so common, especially in statistical and social
 
- # sciences, Stata files and SPSS files can be opened by a lot of modern
 
- # software, for example Python's pandas package provides built-in
 
- # support for them (read_stata() and read_spss()).
 
- # 
 
- # I noticed that the magic database includes an entry for SPSS files but
 
- # not Stata files. Stata files for Stata 13 and newer (formats 117, 118,
 
- # and 119) always begin with the string "<stata_dta><header>" as per
 
- # https://www.stata.com/help.cgi?dta#definition
 
- # 
 
- # The format version number always follows, for example:
 
- #    <stata_dta><header><release>117</release>
 
- #    <stata_dta><header><release>118</release>
 
- # 
 
- # Therefore the following line would do the trick:
 
- #    0       string  <stata_dta><header>     Stata Data File
 
- # 
 
- # (I'm sure the version number could be captured as well but I did not
 
- # manage this without a regex)
 
- # 
 
- # Unfortunately the previous formats (created by Stata before 13, which
 
- # was released 2013) are harder to recognize. Format 115 starts with the
 
- # four bytes 0x73010100 or 0x73020100, format 114 with 0x72010100 or
 
- # 0x72020100, format 113 with 0x71010101 or 0x71020101.
 
- # 
 
- # For additional reference, the Library of Congress website has an entry
 
- # for the Stata Data File Format 118:
 
- # https://www.loc.gov/preservation/digital/formats/fdd/fdd000471.shtml
 
- # 
 
- # Example of those files can be found on Zenodo:
 
- # https://zenodo.org/search?page=1&size=20&q=&file_type=dta
 
- 0	string	\<stata_dta\>\<header\>\<release\>	Stata Data File
 
- >&0	regex	[0-9]*					(Release %s)
 
 
  |