.TH FILE __CSECTION__ "March 2006" "Debian/GNU Linux" "Copyrighted but distributable"
.\" $Id: file.man,v 1.57 2005/08/18 15:18:22 christos Exp $
.SH NAME
file
\- determine file type
.SH SYNOPSIS
.B file
[
.B \-bchikLnNprsvz
]
[
.B \-f
.I namefile
]
[
.B \-F
.I separator
]
[
.B \-m
.I magicfiles
]
.I file
\&...
.br
.B file
.B \-C
[
.B \-m
magicfile ]
.SH DESCRIPTION
This manual page documents version __VERSION__ of the
.B file
command.
.PP
.B File
tests each argument in an attempt to classify it.
There are three sets of tests, performed in this order:
filesystem tests, magic number tests, and language tests.
The
.I first
test that succeeds causes the file type to be printed.
.PP
The type printed will usually contain one of the words
.B text
(the file contains only
printing characters and a few common control
characters and is probably safe to read on an
.SM ASCII
terminal),
.B executable
(the file contains the result of compiling a program
in a form understandable to some \s-1UNIX\s0 kernel or another),
or
.B data
meaning anything else (data is usually `binary' or non-printable).
Exceptions are well-known file formats (core files, tar archives)
that are known to contain binary data.
When adding local definitions to
.IR /etc/magic ,
.BR "preserve these keywords" .
People depend on knowing that all the readable files in a directory
have the word ``text'' printed.
Don't do as Berkeley did and change ``shell commands text''
to ``shell script''.
Note that the file
.I __MAGIC__
is built mechanically from a large number of small files in
the subdirectory
.I Magdir
in the source distribution of this program.
.PP
The filesystem tests are based on examining the return from a
.BR stat (2)
system call.
The program checks to see if the file is empty,
or if it's some sort of special file.
Any known file types appropriate to the system you are running on
(sockets, symbolic links, or named pipes (FIFOs) on those systems that
implement them)
are intuited if they are defined in
the system header file
.IR <sys/stat.h> .
.PP
The magic number tests are used to check for files with data in
particular fixed formats.
The canonical example of this is a binary executable (compiled program)
.I a.out
file, whose format is defined in
.I a.out.h
and possibly
.I exec.h
in the standard include directory.
These files have a `magic number' stored in a particular place
near the beginning of the file that tells the \s-1UNIX\s0 operating system
that the file is a binary executable, and which of several types thereof.
The concept of `magic number' has been applied by extension to data files.
Any file with some invariant identifier at a small fixed
offset into the file can usually be described in this way.
The information identifying these files is read from
.I /etc/magic
and the compiled
magic file
.IR __MAGIC__.mgc ,
or
.I __MAGIC__
if the compiled file does not exist. In addition
.B file
will look in
.IR $HOME/.magic.mgc ,
or
.I $HOME/.magic
for magic entries.
.PP
If a file does not match any of the entries in the magic file,
it is examined to see if it seems to be a text file.
ASCII, ISO-8859-x, non-ISO 8-bit extended-ASCII character sets
(such as those used on Macintosh and IBM PC systems),
UTF-8-encoded Unicode, UTF-16-encoded Unicode, and EBCDIC
character sets can be distinguished by the different
ranges and sequences of bytes that constitute printable text
in each set.
If a file passes any of these tests, its character set is reported.
ASCII, ISO-8859-x, UTF-8, and extended-ASCII files are identified
as ``text'' because they will be mostly readable on nearly any terminal;
UTF-16 and EBCDIC are only ``character data'' because, while
they contain text, it is text that will require translation
before it can be read.
In addition,
.B file
will attempt to determine other characteristics of text-type files.
If the lines of a file are terminated by CR, CRLF, or NEL, instead
of the Unix-standard LF, this will be reported.
Files that contain embedded escape sequences or overstriking
will also be identified.
.PP
Once
.B file
has determined the character set used in a text-type file,
it will
attempt to determine in what language the file is written.
The language tests look for particular strings (cf.\&
.IR names.h )
that can appear anywhere in the first few blocks of a file.
For example, the keyword
.B .br
indicates that the file is most likely a
.BR troff (1)
input file, just as the keyword
.B struct
indicates a C program.
These tests are less reliable than the previous
two groups, so they are performed last.
The language test routines also test for some miscellany
(such as
.BR tar (1)
archives).
.PP
Any file that cannot be identified as having been written
in any of the character sets listed above is simply said to be ``data''.
.SH OPTIONS
.TP 8
.B "\-b, \-\-brief"
Do not prepend filenames to output lines (brief mode).
.TP 8
.B "\-c, \-\-checking\-printout"
Cause a checking printout of the parsed form of the magic file.
This is usually used in conjunction with
.B \-m
to debug a new magic file before installing it.
.TP 8
.B "\-C, \-\-compile"
Write a magic.mgc output file that contains a pre-parsed version of
the magic file.
.TP 8
.BI "\-f, \-\-files\-from" " namefile"
Read the names of the files to be examined from
.I namefile
(one per line) before the argument list.
Either
.I namefile
or at least one filename argument must be present;
to test the standard input, use ``\-'' as a filename argument.
.TP 8
.BI "\-F, \-\-separator" " separator"
Use the specified string as the separator between the filename and the
file result returned. Defaults to ``:''.
.TP 8
.B "\-h, \-\-no\-dereference"
This option causes symlinks not to be followed
(on systems that support symbolic links). This is the default if the
environment variable
.I POSIXLY_CORRECT
is not defined.
.TP 8
.B "\-i, \-\-mime"
Causes the file command to output mime type strings rather than the more
traditional human readable ones. Thus it may say
``text/plain; charset=us-ascii''
rather
than ``ASCII text''.
In order for this option to work, file changes the way
it handles files recognised by the command itself (such as many of the
text file types, directories etc), and makes use of an alternative
``magic'' file.
(See the ``FILES'' section, below).
.TP 8
.B "\-k, \-\-keep\-going"
Don't stop at the first match, keep going. Subsequent matches will be
prepended by ``\\012\- ''.
(If you want a newline, see the ``\-r'' option.)
.TP 8
.B "\-L, \-\-dereference"
This option causes symlinks to be followed, as the like-named option in
.BR ls (1)
(on systems that support symbolic links).
This is the default if the environment variable
.I POSIXLY_CORRECT
is defined.
.TP 8
.BI "\-m, \-\-magic\-file" " list"
Specify an alternate list of files containing magic numbers.
This can be a single file, or a colon-separated list of files.
If a compiled magic file is found alongside, it will be used instead.
With the \-i or \-\-mime option, the program adds ".mime" to each file name.
.TP 8
.B "\-n, \-\-no\-buffer"
Force stdout to be flushed after checking each file.
This is only useful if checking a list of files.
It is intended to be used by programs that want filetype output from a pipe.
.TP 8
.B "\-N, \-\-no\-pad"
Don't pad filenames so that they align in the output.
.TP 8
.B "\-p, \-\-preserve\-date"
On systems that support
.BR utime (2)
or
.BR utimes (2),
attempt to preserve the access time of files analyzed, to pretend that
.BR file (1)
never read them.
.TP 8
.B "\-r, \-\-raw"
Don't translate unprintable characters to \eooo.
Normally
.B file
translates unprintable characters to their octal representation.
.TP 8
.B "\-s, \-\-special\-files"
Normally,
.B file
only attempts to read and determine the type of argument files which
.BR stat (2)
reports are ordinary files.
This prevents problems, because reading special files may have peculiar
consequences.
Specifying the
.B \-s
option causes
.B file
to also read argument files which are block or character special files.
This is useful for determining the filesystem types of the data in raw
disk partitions, which are block special files.
This option also causes
.B file
to disregard the file size as reported by
.BR stat (2)
since on some systems it reports a zero size for raw disk partitions.
.TP 8
.B "\-v, \-\-version"
Print the version of the program and exit.
.TP 8
.B "\-z, \-\-uncompress"
Try to look inside compressed files.
.TP 8
.B "\-\-help"
Print a help message and exit.
.SH FILES
.TP
.I __MAGIC__.mgc
Default compiled list of magic numbers
.TP
.I __MAGIC__
Default list of magic numbers
.TP
.I __MAGIC__.mime.mgc
Default compiled list of magic numbers, used to output mime types when
the \-i option is specified.
.TP
.I __MAGIC__.mime
Default list of magic numbers, used to output mime types when the \-i option
is specified.
.SH ENVIRONMENT
The environment variable
.B MAGIC
can be used to set the default magic number file name.
If that variable is set, then
.B file
will not attempt to open
.BR $HOME/.magic .
.B file
adds ".mime" and/or ".mgc" to the value of this variable as appropriate.
The environment variable
.B POSIXLY_CORRECT
controls (on systems that support symbolic links) whether
.B file
will attempt to follow symlinks. If set, then
.B file
follows symlinks; otherwise it does not. This is also controlled
by the
.B \-L
and
.B \-h
options.
.SH SEE ALSO
.BR magic (__FSECTION__)
\- description of magic file format.
.br
.BR strings "(1), " od "(1), " hexdump (1)
\- tools for examining non-text
files.
.SH STANDARDS CONFORMANCE
This program is believed to exceed the System V Interface Definition
of FILE(CMD), as near as one can determine from the vague language
contained therein.
Its behaviour is mostly compatible with the System V program of the same name.
This version knows more magic, however, so it will produce
different (albeit more accurate) output in many cases.
.PP
The one significant difference
between this version and System V
is that this version treats any white space
as a delimiter, so that spaces in pattern strings must be escaped.
For example,
.br
>10	string	language impress\ 	(imPRESS data)
.br
in an existing magic file would have to be changed to
.br
>10	string	language\e impress	(imPRESS data)
.br
In addition, in this version, if a pattern string contains a backslash,
it must be escaped.
For example
.br
0	string		\ebegindata	Andrew Toolkit document
.br
in an existing magic file would have to be changed to
.br
0	string		\e\ebegindata	Andrew Toolkit document
.br
.PP
SunOS releases 3.2 and later from Sun Microsystems include a
.BR file (1)
command derived from the System V one, but with some extensions.
My version differs from Sun's only in minor ways.
It includes the extension of the `&' operator, used as,
for example,
.br
>16	long&0x7fffffff	>0		not stripped
.SH MAGIC DIRECTORY
The magic file entries have been collected from various sources,
mainly USENET, and contributed by various authors.
Christos Zoulas (address below) will collect additional
or corrected magic file entries.
A consolidation of magic file entries
will be distributed periodically.
.PP
The order of entries in the magic file is significant.
Depending on what system you are using, the order that
they are put together may be incorrect.
.SH EXAMPLES
.nf
$ file file.c file /dev/{wd0a,hda}
file.c:   C program text
file:     ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
          dynamically linked (uses shared libs), stripped
/dev/wd0a: block special (0/0)
/dev/hda: block special (3/0)
$ file -s /dev/wd0{b,d}
/dev/wd0b: data
/dev/wd0d: x86 boot sector
$ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
/dev/hda:   x86 boot sector
/dev/hda1:  Linux/i386 ext2 filesystem
/dev/hda2:  x86 boot sector
/dev/hda3:  x86 boot sector, extended partition table
/dev/hda4:  Linux/i386 ext2 filesystem
/dev/hda5:  Linux/i386 swap file
/dev/hda6:  Linux/i386 swap file
/dev/hda7:  Linux/i386 swap file
/dev/hda8:  Linux/i386 swap file
/dev/hda9:  empty
/dev/hda10: empty
$ file -i file.c file /dev/{wd0a,hda}
file.c:      text/x-c
file:        application/x-executable, dynamically linked (uses shared libs),
not stripped
/dev/hda:    application/x-not-regular-file
/dev/wd0a:   application/x-not-regular-file
.fi
.SH HISTORY
There has been a
.B file
command in every \s-1UNIX\s0 since at least Research Version 4
(man page dated November, 1973).
The System V version introduced one significant major change:
the external list of magic number types.
This slowed the program down slightly but made it a lot more flexible.
.PP
This program, based on the System V version,
was written by Ian Darwin <ian@darwinsys.com>
without looking at anybody else's source code.
.PP
John Gilmore revised the code extensively, making it better than
the first version.
Geoff Collyer found several inadequacies
and provided some magic file entries.
Contribution of the `&' operator by Rob McMahon, cudcv@warwick.ac.uk, 1989.
.PP
Guy Harris, guy@netapp.com, made many changes from 1993 to the present.
.PP
Primary development and maintenance from 1990 to the present by
Christos Zoulas (christos@astron.com).
.PP
Altered by Chris Lowth, chris@lowth.com, 2000:
Handle the ``\-i'' option to output mime type strings, using an alternative
magic file and internal logic.
.PP
Altered by Eric Fischer (enf@pobox.com), July, 2000,
to identify character codes and attempt to identify the languages
of non-ASCII files.
.PP
The list of contributors to the "Magdir" directory (source for the
.I __MAGIC__
file) is too long to include here.
You know who you are; thank you.
.SH LEGAL NOTICE
Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
Covered by the standard Berkeley Software Distribution copyright; see the file
LEGAL.NOTICE in the source distribution.
.PP
The files
.I tar.h
and
.I is_tar.c
were written by John Gilmore from his public-domain
.B tar
program, and are not covered by the above license.
.SH BUGS
There must be a better way to automate the construction of the Magic
file from all the glop in Magdir.
What is it?
Better yet, the magic file should be compiled into binary (say,
.BR ndbm (3)
or, better yet, fixed-length
.SM ASCII
strings for use in heterogeneous network environments) for faster startup.
Then the program would run as fast as the Version 7 program of the same name,
with the flexibility of the System V version.
.PP
.B File
uses several algorithms that favor speed over accuracy,
thus it can be misled about the contents of
text
files.
.PP
The support for
text
files (primarily for programming languages)
is simplistic, inefficient and requires recompilation to update.
.PP
There should be an ``else'' clause to follow a series of continuation lines.
.PP
The magic file and keywords should have regular expression support.
Their use of
.SM "ASCII TAB"
as a field delimiter is ugly and makes
it hard to edit the files, but is entrenched.
.PP
It might be advisable to allow upper-case letters in keywords
for, e.g.,
.BR troff (1)
commands vs. man page macros.
Regular expression support would make this easy.
.PP
The program doesn't grok \s-2FORTRAN\s0.
It should be able to figure out \s-2FORTRAN\s0 by seeing some keywords which
appear indented at the start of a line.
Regular expression support would make this easy.
.PP
The list of keywords in
.I ascmagic
probably belongs in the Magic file.
This could be done by using some keyword like `*' for the offset value.
.PP
Another optimisation would be to sort
the magic file so that we can just run down all the
tests for the first byte, first word, first long, etc, once we
have fetched it.
Complain about conflicts in the magic file entries.
Make a rule that the magic entries sort based on file offset rather
than position within the magic file?
.PP
The program should provide a way to give an estimate
of ``how good'' a guess is.
We end up removing guesses (e.g. ``From '' as first 5 chars of file) because
they are not as good as other guesses (e.g. ``Newsgroups:'' versus
``Return-Path:'').
Still, if the others don't pan out, it should be possible to use the
first guess.
.PP
This program is slower than some vendors' file commands.
The new support for multiple character codes makes it even slower.
.PP
This manual page, and particularly this section, is too long.
.SH RETURN CODE
.B file
almost always returns 0. It returns a different value if it cannot open a file.
.SH AVAILABILITY
You can obtain the original author's latest version by anonymous FTP
on
.B ftp.astron.com
in the directory
.I /pub/file/file-X.YZ.tar.gz
.PP
This
.B Debian
version adds a number of new magic entries. It can be
obtained from every site carrying a
.B Debian
distribution (ftp.debian.org and mirrors).