libmagic-compat.patch 38 KB


  1. Subject: libmagic compatibility
  2. Origin: libmagic-compat branch, commit 315cb4c
  3. Upstream-Author: Adam Hupp <adam@hupp.org>
  4. Date: Mon Dec 4 11:55:27 2017 -0800
  5. Last-Updated: 2018-01-15
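The add_compat helper that this patch adds to magic/__init__.py copies the upper-case constants from magic/compat.py into the package namespace and refuses to continue if a shared name carries a different value, with MAGIC_MIME as the one tolerated exception. The following is a minimal standalone sketch of that merge rule, not the code from the patch: the dictionaries `native` and `compat` and the function name `merge_constants` are hypothetical stand-ins, and it raises ValueError where the patch raises a bare Exception.

```python
import re

# Hypothetical stand-ins for the two module namespaces. The values mirror
# constants that appear in this patch; "_open" stands for a non-constant name.
native = {"MAGIC_NONE": 0x000000, "MAGIC_MIME": 0x000010, "MAGIC_RAW": 0x000100}
compat = {"MAGIC_NONE": 0, "MAGIC_MIME": 1040, "MAGIC_RAW": 256,
          "MAGIC_APPLE": 2048, "_open": object()}


def merge_constants(to_module, from_module,
                    allowed_inconsistent=frozenset(["MAGIC_MIME"])):
    """Copy ALL_CAPS names from from_module into to_module.

    A name present in both must have the same value unless it is listed
    in allowed_inconsistent, in which case the existing value is kept.
    """
    is_const = re.compile(r"^[A-Z_]+$")
    for name, value in from_module.items():
        if not is_const.match(name):
            continue  # skip functions, private helpers, and so on
        if name in to_module:
            if name in allowed_inconsistent:
                continue  # keep the value already in to_module
            if to_module[name] != value:
                raise ValueError("inconsistent value for " + name)
        else:
            to_module[name] = value
    return to_module


merged = merge_constants(dict(native), compat)
```

In this sketch `merged` gains MAGIC_APPLE from the compat side, keeps the native MAGIC_MIME value of 0x10 rather than 1040, and never picks up `_open` because the name is not all upper case.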
  6. --- a/LICENSE
  7. +++ b/LICENSE
  8. @@ -19,3 +19,40 @@
  9. LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  10. OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  11. SOFTWARE.
  12. +
  13. +
  14. +====
  15. +
  16. +Portions of this package (magic/compat.py and test/libmagic_test.py)
  17. +are distributed under the following copyright notice:
  18. +
  19. +
  20. +$File: LEGAL.NOTICE,v 1.15 2006/05/03 18:48:33 christos Exp $
  21. +Copyright (c) Ian F. Darwin 1986, 1987, 1989, 1990, 1991, 1992, 1994, 1995.
  22. +Software written by Ian F. Darwin and others;
  23. +maintained 1994- Christos Zoulas.
  24. +
  25. +This software is not subject to any export provision of the United States
  26. +Department of Commerce, and may be exported to any country or planet.
  27. +
  28. +Redistribution and use in source and binary forms, with or without
  29. +modification, are permitted provided that the following conditions
  30. +are met:
  31. +1. Redistributions of source code must retain the above copyright
  32. + notice immediately at the beginning of the file, without modification,
  33. + this list of conditions, and the following disclaimer.
  34. +2. Redistributions in binary form must reproduce the above copyright
  35. + notice, this list of conditions and the following disclaimer in the
  36. + documentation and/or other materials provided with the distribution.
  37. +
  38. +THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  39. +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  40. +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  41. +ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
  42. +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  43. +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  44. +OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  45. +HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  46. +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  47. +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  48. +SUCH DAMAGE.
  49. --- a/README.md
  50. +++ b/README.md
  51. @@ -45,9 +45,18 @@
  52. Minor version bumps should be backwards compatible. Major bumps are not.
  53. -## Name Conflict
  54. +## Compatibility
  55. -There are, sadly, two libraries which use the module name `magic`. Both have been around for quite a while.If you are using this module and get an error using a method like `open`, your code is expecting the other one. Hopefully one day these will be reconciled.
  56. +There are, sadly, 3 libraries using the package name `magic`. The others are:
  57. +
  58. +1. libmagic itself distributes a `magic` python module with a somewhat
  59. +different API. python-magic includes a copy of this module to avoid
  60. +unnecessary breakage when both versions are installed. Maybe someday
  61. +they will converge.
  62. +
  63. +2. python-libmagic also uses the same module name, and has a similar
  64. +but not identical API. If you run into errors about "magic.h" not
  65. +being present, you should uninstall python-libmagic.
  66. ## Installation
  67. @@ -64,7 +73,7 @@
  68. You'll need DLLs for libmagic. @julian-r has uploaded a versoin of this project that includes binaries to pypi:
  69. https://pypi.python.org/pypi/python-magic-bin/0.4.14
  70. -Other sources of the libraries in the past have been [File for Windows](http://gnuwin32.sourceforge.net/packages/file.htm) . You will need to copy the file `magic` out of `[binary-zip]\share\misc`, and pass it's location to `Magic(magic_file=...)`.
  71. +Other sources of the libraries in the past have been [File for Windows](http://gnuwin32.sourceforge.net/packages/file.htm) . You will need to copy the file `magic` out of `[binary-zip]\share\misc`, and pass its location to `Magic(magic_file=...)`.
  72. If you are using a 64-bit build of python, you'll need 64-bit libmagic binaries which can be found here: https://github.com/pidydx/libmagicwin64. Newer version can be found here: https://github.com/nscaife/file-windows.
  73. @@ -86,7 +95,7 @@
  74. Attempting to run the 32-bit libmagic DLL in a 64-bit build of
  75. python will fail with this error. Here are 64-bit builds of libmagic for windows: https://github.com/pidydx/libmagicwin64
  76. -- 'WindowsError: exception: access violation writing 0x00000000 ' This may indicate you are mixing
  77. +- 'WindowsError: exception: access violation writing 0x00000000 ' This may indicate you are mixing
  78. Windows Python and Cygwin Python. Make sure your libmagic and python builds are consistent.
  79. ## Author
  80. @@ -116,5 +125,3 @@
  81. python-magic is distributed under the MIT license. See the included
  82. LICENSE file for details.
  83. -
  84. -
  85. --- a/magic.py
  86. +++ /dev/null
  87. @@ -1,301 +0,0 @@
  88. -"""
  89. -magic is a wrapper around the libmagic file identification library.
  90. -
  91. -See README for more information.
  92. -
  93. -Usage:
  94. -
  95. ->>> import magic
  96. ->>> magic.from_file("testdata/test.pdf")
  97. -'PDF document, version 1.2'
  98. ->>> magic.from_file("testdata/test.pdf", mime=True)
  99. -'application/pdf'
  100. ->>> magic.from_buffer(open("testdata/test.pdf").read(1024))
  101. -'PDF document, version 1.2'
  102. ->>>
  103. -
  104. -
  105. -"""
  106. -
  107. -import sys
  108. -import glob
  109. -import os.path
  110. -import ctypes
  111. -import ctypes.util
  112. -import threading
  113. -
  114. -from ctypes import c_char_p, c_int, c_size_t, c_void_p
  115. -
  116. -
  117. -class MagicException(Exception):
  118. - def __init__(self, message):
  119. - super(MagicException, self).__init__(message)
  120. - self.message = message
  121. -
  122. -
  123. -class Magic:
  124. - """
  125. - Magic is a wrapper around the libmagic C library.
  126. -
  127. - """
  128. -
  129. - def __init__(self, mime=False, magic_file=None, mime_encoding=False,
  130. - keep_going=False, uncompress=False):
  131. - """
  132. - Create a new libmagic wrapper.
  133. -
  134. - mime - if True, mimetypes are returned instead of textual descriptions
  135. - mime_encoding - if True, codec is returned
  136. - magic_file - use a mime database other than the system default
  137. - keep_going - don't stop at the first match, keep going
  138. - uncompress - Try to look inside compressed files.
  139. - """
  140. - self.flags = MAGIC_NONE
  141. - if mime:
  142. - self.flags |= MAGIC_MIME
  143. - if mime_encoding:
  144. - self.flags |= MAGIC_MIME_ENCODING
  145. - if keep_going:
  146. - self.flags |= MAGIC_CONTINUE
  147. -
  148. - if uncompress:
  149. - self.flags |= MAGIC_COMPRESS
  150. -
  151. - self.cookie = magic_open(self.flags)
  152. - self.lock = threading.Lock()
  153. -
  154. - magic_load(self.cookie, magic_file)
  155. -
  156. - def from_buffer(self, buf):
  157. - """
  158. - Identify the contents of `buf`
  159. - """
  160. - with self.lock:
  161. - try:
  162. - # if we're on python3, convert buf to bytes
  163. - # otherwise this string is passed as wchar*
  164. - # which is not what libmagic expects
  165. - if type(buf) == str and str != bytes:
  166. - buf = buf.encode('utf-8', errors='replace')
  167. - return maybe_decode(magic_buffer(self.cookie, buf))
  168. - except MagicException as e:
  169. - return self._handle509Bug(e)
  170. -
  171. - def from_file(self, filename):
  172. - # raise FileNotFoundException or IOError if the file does not exist
  173. - with open(filename):
  174. - pass
  175. - with self.lock:
  176. - try:
  177. - return maybe_decode(magic_file(self.cookie, filename))
  178. - except MagicException as e:
  179. - return self._handle509Bug(e)
  180. -
  181. - def _handle509Bug(self, e):
  182. - # libmagic 5.09 has a bug where it might fail to identify the
  183. - # mimetype of a file and returns null from magic_file (and
  184. - # likely _buffer), but also does not return an error message.
  185. - if e.message is None and (self.flags & MAGIC_MIME):
  186. - return "application/octet-stream"
  187. - else:
  188. - raise e
  189. -
  190. - def __del__(self):
  191. - # no _thread_check here because there can be no other
  192. - # references to this object at this point.
  193. -
  194. - # during shutdown magic_close may have been cleared already so
  195. - # make sure it exists before using it.
  196. -
  197. - # the self.cookie check should be unnecessary and was an
  198. - # incorrect fix for a threading problem, however I'm leaving
  199. - # it in because it's harmless and I'm slightly afraid to
  200. - # remove it.
  201. - if self.cookie and magic_close:
  202. - magic_close(self.cookie)
  203. - self.cookie = None
  204. -
  205. -_instances = {}
  206. -
  207. -def _get_magic_type(mime):
  208. - i = _instances.get(mime)
  209. - if i is None:
  210. - i = _instances[mime] = Magic(mime=mime)
  211. - return i
  212. -
  213. -def from_file(filename, mime=False):
  214. - """"
  215. - Accepts a filename and returns the detected filetype. Return
  216. - value is the mimetype if mime=True, otherwise a human readable
  217. - name.
  218. -
  219. - >>> magic.from_file("testdata/test.pdf", mime=True)
  220. - 'application/pdf'
  221. - """
  222. - m = _get_magic_type(mime)
  223. - return m.from_file(filename)
  224. -
  225. -def from_buffer(buffer, mime=False):
  226. - """
  227. - Accepts a binary string and returns the detected filetype. Return
  228. - value is the mimetype if mime=True, otherwise a human readable
  229. - name.
  230. -
  231. - >>> magic.from_buffer(open("testdata/test.pdf").read(1024))
  232. - 'PDF document, version 1.2'
  233. - """
  234. - m = _get_magic_type(mime)
  235. - return m.from_buffer(buffer)
  236. -
  237. -
  238. -
  239. -
  240. -libmagic = None
  241. -# Let's try to find magic or magic1
  242. -dll = ctypes.util.find_library('magic') or ctypes.util.find_library('magic1') or ctypes.util.find_library('cygmagic-1')
  243. -
  244. -# This is necessary because find_library returns None if it doesn't find the library
  245. -if dll:
  246. - libmagic = ctypes.CDLL(dll)
  247. -
  248. -if not libmagic or not libmagic._name:
  249. - windows_dlls = ['magic1.dll','cygmagic-1.dll']
  250. - platform_to_lib = {'darwin': ['/opt/local/lib/libmagic.dylib',
  251. - '/usr/local/lib/libmagic.dylib'] +
  252. - # Assumes there will only be one version installed
  253. - glob.glob('/usr/local/Cellar/libmagic/*/lib/libmagic.dylib'),
  254. - 'win32': windows_dlls,
  255. - 'cygwin': windows_dlls,
  256. - 'linux': ['libmagic.so.1'], # fallback for some Linuxes (e.g. Alpine) where library search does not work
  257. - }
  258. - platform = 'linux' if sys.platform.startswith('linux') else sys.platform
  259. - for dll in platform_to_lib.get(platform, []):
  260. - try:
  261. - libmagic = ctypes.CDLL(dll)
  262. - break
  263. - except OSError:
  264. - pass
  265. -
  266. -if not libmagic or not libmagic._name:
  267. - # It is better to raise an ImportError since we are importing magic module
  268. - raise ImportError('failed to find libmagic. Check your installation')
  269. -
  270. -magic_t = ctypes.c_void_p
  271. -
  272. -def errorcheck_null(result, func, args):
  273. - if result is None:
  274. - err = magic_error(args[0])
  275. - raise MagicException(err)
  276. - else:
  277. - return result
  278. -
  279. -def errorcheck_negative_one(result, func, args):
  280. - if result is -1:
  281. - err = magic_error(args[0])
  282. - raise MagicException(err)
  283. - else:
  284. - return result
  285. -
  286. -
  287. -# return str on python3. Don't want to unconditionally
  288. -# decode because that results in unicode on python2
  289. -def maybe_decode(s):
  290. - if str == bytes:
  291. - return s
  292. - else:
  293. - return s.decode('utf-8')
  294. -
  295. -def coerce_filename(filename):
  296. - if filename is None:
  297. - return None
  298. -
  299. - # ctypes will implicitly convert unicode strings to bytes with
  300. - # .encode('ascii'). If you use the filesystem encoding
  301. - # then you'll get inconsistent behavior (crashes) depending on the user's
  302. - # LANG environment variable
  303. - is_unicode = (sys.version_info[0] <= 2 and
  304. - isinstance(filename, unicode)) or \
  305. - (sys.version_info[0] >= 3 and
  306. - isinstance(filename, str))
  307. - if is_unicode:
  308. - return filename.encode('utf-8', 'surrogateescape')
  309. - else:
  310. - return filename
  311. -
  312. -magic_open = libmagic.magic_open
  313. -magic_open.restype = magic_t
  314. -magic_open.argtypes = [c_int]
  315. -
  316. -magic_close = libmagic.magic_close
  317. -magic_close.restype = None
  318. -magic_close.argtypes = [magic_t]
  319. -
  320. -magic_error = libmagic.magic_error
  321. -magic_error.restype = c_char_p
  322. -magic_error.argtypes = [magic_t]
  323. -
  324. -magic_errno = libmagic.magic_errno
  325. -magic_errno.restype = c_int
  326. -magic_errno.argtypes = [magic_t]
  327. -
  328. -_magic_file = libmagic.magic_file
  329. -_magic_file.restype = c_char_p
  330. -_magic_file.argtypes = [magic_t, c_char_p]
  331. -_magic_file.errcheck = errorcheck_null
  332. -
  333. -def magic_file(cookie, filename):
  334. - return _magic_file(cookie, coerce_filename(filename))
  335. -
  336. -_magic_buffer = libmagic.magic_buffer
  337. -_magic_buffer.restype = c_char_p
  338. -_magic_buffer.argtypes = [magic_t, c_void_p, c_size_t]
  339. -_magic_buffer.errcheck = errorcheck_null
  340. -
  341. -def magic_buffer(cookie, buf):
  342. - return _magic_buffer(cookie, buf, len(buf))
  343. -
  344. -
  345. -_magic_load = libmagic.magic_load
  346. -_magic_load.restype = c_int
  347. -_magic_load.argtypes = [magic_t, c_char_p]
  348. -_magic_load.errcheck = errorcheck_negative_one
  349. -
  350. -def magic_load(cookie, filename):
  351. - return _magic_load(cookie, coerce_filename(filename))
  352. -
  353. -magic_setflags = libmagic.magic_setflags
  354. -magic_setflags.restype = c_int
  355. -magic_setflags.argtypes = [magic_t, c_int]
  356. -
  357. -magic_check = libmagic.magic_check
  358. -magic_check.restype = c_int
  359. -magic_check.argtypes = [magic_t, c_char_p]
  360. -
  361. -magic_compile = libmagic.magic_compile
  362. -magic_compile.restype = c_int
  363. -magic_compile.argtypes = [magic_t, c_char_p]
  364. -
  365. -
  366. -
  367. -MAGIC_NONE = 0x000000 # No flags
  368. -MAGIC_DEBUG = 0x000001 # Turn on debugging
  369. -MAGIC_SYMLINK = 0x000002 # Follow symlinks
  370. -MAGIC_COMPRESS = 0x000004 # Check inside compressed files
  371. -MAGIC_DEVICES = 0x000008 # Look at the contents of devices
  372. -MAGIC_MIME = 0x000010 # Return a mime string
  373. -MAGIC_MIME_ENCODING = 0x000400 # Return the MIME encoding
  374. -MAGIC_CONTINUE = 0x000020 # Return all matches
  375. -MAGIC_CHECK = 0x000040 # Print warnings to stderr
  376. -MAGIC_PRESERVE_ATIME = 0x000080 # Restore access time on exit
  377. -MAGIC_RAW = 0x000100 # Don't translate unprintable chars
  378. -MAGIC_ERROR = 0x000200 # Handle ENOENT etc as real errors
  379. -
  380. -MAGIC_NO_CHECK_COMPRESS = 0x001000 # Don't check for compressed files
  381. -MAGIC_NO_CHECK_TAR = 0x002000 # Don't check for tar files
  382. -MAGIC_NO_CHECK_SOFT = 0x004000 # Don't check magic entries
  383. -MAGIC_NO_CHECK_APPTYPE = 0x008000 # Don't check application type
  384. -MAGIC_NO_CHECK_ELF = 0x010000 # Don't check for elf details
  385. -MAGIC_NO_CHECK_ASCII = 0x020000 # Don't check for ascii files
  386. -MAGIC_NO_CHECK_TROFF = 0x040000 # Don't check ascii/troff
  387. -MAGIC_NO_CHECK_FORTRAN = 0x080000 # Don't check ascii/fortran
  388. -MAGIC_NO_CHECK_TOKENS = 0x100000 # Don't check ascii/tokens
  389. --- /dev/null
  390. +++ b/magic/__init__.py
  391. @@ -0,0 +1,361 @@
  392. +"""
  393. +magic is a wrapper around the libmagic file identification library.
  394. +
  395. +See README for more information.
  396. +
  397. +Usage:
  398. +
  399. +>>> import magic
  400. +>>> magic.from_file("testdata/test.pdf")
  401. +'PDF document, version 1.2'
  402. +>>> magic.from_file("testdata/test.pdf", mime=True)
  403. +'application/pdf'
  404. +>>> magic.from_buffer(open("testdata/test.pdf").read(1024))
  405. +'PDF document, version 1.2'
  406. +>>>
  407. +
  408. +
  409. +"""
  410. +
  411. +import sys
  412. +import glob
  413. +import os.path
  414. +import ctypes
  415. +import ctypes.util
  416. +import threading
  417. +import logging
  418. +
  419. +from ctypes import c_char_p, c_int, c_size_t, c_void_p
  420. +
  421. +# avoid shadowing the real open with the version from compat.py
  422. +_real_open = open
  423. +
  424. +class MagicException(Exception):
  425. + def __init__(self, message):
  426. + super(MagicException, self).__init__(message)
  427. + self.message = message
  428. +
  429. +
  430. +class Magic:
  431. + """
  432. + Magic is a wrapper around the libmagic C library.
  433. +
  434. + """
  435. +
  436. + def __init__(self, mime=False, magic_file=None, mime_encoding=False,
  437. + keep_going=False, uncompress=False):
  438. + """
  439. + Create a new libmagic wrapper.
  440. +
  441. + mime - if True, mimetypes are returned instead of textual descriptions
  442. + mime_encoding - if True, codec is returned
  443. + magic_file - use a mime database other than the system default
  444. + keep_going - don't stop at the first match, keep going
  445. + uncompress - Try to look inside compressed files.
  446. + """
  447. + self.flags = MAGIC_NONE
  448. + if mime:
  449. + self.flags |= MAGIC_MIME
  450. + if mime_encoding:
  451. + self.flags |= MAGIC_MIME_ENCODING
  452. + if keep_going:
  453. + self.flags |= MAGIC_CONTINUE
  454. +
  455. + if uncompress:
  456. + self.flags |= MAGIC_COMPRESS
  457. +
  458. + self.cookie = magic_open(self.flags)
  459. + self.lock = threading.Lock()
  460. +
  461. + magic_load(self.cookie, magic_file)
  462. +
  463. + def from_buffer(self, buf):
  464. + """
  465. + Identify the contents of `buf`
  466. + """
  467. + with self.lock:
  468. + try:
  469. + # if we're on python3, convert buf to bytes
  470. + # otherwise this string is passed as wchar*
  471. + # which is not what libmagic expects
  472. + if type(buf) == str and str != bytes:
  473. + buf = buf.encode('utf-8', errors='replace')
  474. + return maybe_decode(magic_buffer(self.cookie, buf))
  475. + except MagicException as e:
  476. + return self._handle509Bug(e)
  477. +
  478. + def from_open_file(self, open_file):
  479. + with self.lock:
  480. + try:
  481. + return maybe_decode(magic_descriptor(self.cookie, open_file.fileno()))
  482. + except MagicException as e:
  483. + return self._handle509Bug(e)
  484. +
  485. + def from_file(self, filename):
  486. + # raise FileNotFoundError or IOError if the file does not exist
  487. + with _real_open(filename):
  488. + pass
  489. +
  490. + with self.lock:
  491. + try:
  492. + return maybe_decode(magic_file(self.cookie, filename))
  493. + except MagicException as e:
  494. + return self._handle509Bug(e)
  495. +
  496. + def _handle509Bug(self, e):
  497. + # libmagic 5.09 has a bug where it might fail to identify the
  498. + # mimetype of a file and returns null from magic_file (and
  499. + # likely _buffer), but also does not return an error message.
  500. + if e.message is None and (self.flags & MAGIC_MIME):
  501. + return "application/octet-stream"
  502. + else:
  503. + raise e
  504. +
  505. + def __del__(self):
  506. + # no _thread_check here because there can be no other
  507. + # references to this object at this point.
  508. +
  509. + # during shutdown magic_close may have been cleared already so
  510. + # make sure it exists before using it.
  511. +
  512. + # the self.cookie check should be unnecessary and was an
  513. + # incorrect fix for a threading problem, however I'm leaving
  514. + # it in because it's harmless and I'm slightly afraid to
  515. + # remove it.
  516. + if self.cookie and magic_close:
  517. + magic_close(self.cookie)
  518. + self.cookie = None
  519. +
  520. +_instances = {}
  521. +
  522. +def _get_magic_type(mime):
  523. + i = _instances.get(mime)
  524. + if i is None:
  525. + i = _instances[mime] = Magic(mime=mime)
  526. + return i
  527. +
  528. +def from_file(filename, mime=False):
  529. + """
  530. + Accepts a filename and returns the detected filetype. Return
  531. + value is the mimetype if mime=True, otherwise a human readable
  532. + name.
  533. +
  534. + >>> magic.from_file("testdata/test.pdf", mime=True)
  535. + 'application/pdf'
  536. + """
  537. + m = _get_magic_type(mime)
  538. + return m.from_file(filename)
  539. +
  540. +def from_buffer(buffer, mime=False):
  541. + """
  542. + Accepts a binary string and returns the detected filetype. Return
  543. + value is the mimetype if mime=True, otherwise a human readable
  544. + name.
  545. +
  546. + >>> magic.from_buffer(open("testdata/test.pdf").read(1024))
  547. + 'PDF document, version 1.2'
  548. + """
  549. + m = _get_magic_type(mime)
  550. + return m.from_buffer(buffer)
  551. +
  552. +
  553. +
  554. +
  555. +libmagic = None
  556. +# Let's try to find magic or magic1
  557. +dll = ctypes.util.find_library('magic') or ctypes.util.find_library('magic1') or ctypes.util.find_library('cygmagic-1')
  558. +
  559. +# This is necessary because find_library returns None if it doesn't find the library
  560. +if dll:
  561. + libmagic = ctypes.CDLL(dll)
  562. +
  563. +if not libmagic or not libmagic._name:
  564. + windows_dlls = ['magic1.dll','cygmagic-1.dll']
  565. + platform_to_lib = {'darwin': ['/opt/local/lib/libmagic.dylib',
  566. + '/usr/local/lib/libmagic.dylib'] +
  567. + # Assumes there will only be one version installed
  568. + glob.glob('/usr/local/Cellar/libmagic/*/lib/libmagic.dylib'),
  569. + 'win32': windows_dlls,
  570. + 'cygwin': windows_dlls,
  571. + 'linux': ['libmagic.so.1'], # fallback for some Linuxes (e.g. Alpine) where library search does not work
  572. + }
  573. + platform = 'linux' if sys.platform.startswith('linux') else sys.platform
  574. + for dll in platform_to_lib.get(platform, []):
  575. + try:
  576. + libmagic = ctypes.CDLL(dll)
  577. + break
  578. + except OSError:
  579. + pass
  580. +
  581. +if not libmagic or not libmagic._name:
  582. + # It is better to raise an ImportError since we are importing magic module
  583. + raise ImportError('failed to find libmagic. Check your installation')
  584. +
  585. +magic_t = ctypes.c_void_p
  586. +
  587. +def errorcheck_null(result, func, args):
  588. + if result is None:
  589. + err = magic_error(args[0])
  590. + raise MagicException(err)
  591. + else:
  592. + return result
  593. +
  594. +def errorcheck_negative_one(result, func, args):
  595. + if result == -1:
  596. + err = magic_error(args[0])
  597. + raise MagicException(err)
  598. + else:
  599. + return result
  600. +
  601. +
  602. +# return str on python3. Don't want to unconditionally
  603. +# decode because that results in unicode on python2
  604. +def maybe_decode(s):
  605. + if str == bytes:
  606. + return s
  607. + else:
  608. + return s.decode('utf-8')
  609. +
  610. +def coerce_filename(filename):
  611. + if filename is None:
  612. + return None
  613. +
  614. + # ctypes will implicitly convert unicode strings to bytes with
  615. + # .encode('ascii'). If you use the filesystem encoding
  616. + # then you'll get inconsistent behavior (crashes) depending on the user's
  617. + # LANG environment variable
  618. + is_unicode = (sys.version_info[0] <= 2 and
  619. + isinstance(filename, unicode)) or \
  620. + (sys.version_info[0] >= 3 and
  621. + isinstance(filename, str))
  622. + if is_unicode:
  623. + return filename.encode('utf-8', 'surrogateescape')
  624. + else:
  625. + return filename
  626. +
  627. +magic_open = libmagic.magic_open
  628. +magic_open.restype = magic_t
  629. +magic_open.argtypes = [c_int]
  630. +
  631. +magic_close = libmagic.magic_close
  632. +magic_close.restype = None
  633. +magic_close.argtypes = [magic_t]
  634. +
  635. +magic_error = libmagic.magic_error
  636. +magic_error.restype = c_char_p
  637. +magic_error.argtypes = [magic_t]
  638. +
  639. +magic_errno = libmagic.magic_errno
  640. +magic_errno.restype = c_int
  641. +magic_errno.argtypes = [magic_t]
  642. +
  643. +_magic_file = libmagic.magic_file
  644. +_magic_file.restype = c_char_p
  645. +_magic_file.argtypes = [magic_t, c_char_p]
  646. +_magic_file.errcheck = errorcheck_null
  647. +
  648. +def magic_file(cookie, filename):
  649. + return _magic_file(cookie, coerce_filename(filename))
  650. +
  651. +_magic_buffer = libmagic.magic_buffer
  652. +_magic_buffer.restype = c_char_p
  653. +_magic_buffer.argtypes = [magic_t, c_void_p, c_size_t]
  654. +_magic_buffer.errcheck = errorcheck_null
  655. +
  656. +def magic_buffer(cookie, buf):
  657. + return _magic_buffer(cookie, buf, len(buf))
  658. +
  659. +magic_descriptor = libmagic.magic_descriptor
  660. +magic_descriptor.restype = c_char_p
  661. +magic_descriptor.argtypes = [magic_t, c_int]
  662. +magic_descriptor.errcheck = errorcheck_null
  663. +
  664. +_magic_load = libmagic.magic_load
  665. +_magic_load.restype = c_int
  666. +_magic_load.argtypes = [magic_t, c_char_p]
  667. +_magic_load.errcheck = errorcheck_negative_one
  668. +
  669. +def magic_load(cookie, filename):
  670. + return _magic_load(cookie, coerce_filename(filename))
  671. +
  672. +magic_setflags = libmagic.magic_setflags
  673. +magic_setflags.restype = c_int
  674. +magic_setflags.argtypes = [magic_t, c_int]
  675. +
  676. +magic_check = libmagic.magic_check
  677. +magic_check.restype = c_int
  678. +magic_check.argtypes = [magic_t, c_char_p]
  679. +
  680. +magic_compile = libmagic.magic_compile
  681. +magic_compile.restype = c_int
  682. +magic_compile.argtypes = [magic_t, c_char_p]
  683. +
  684. +
  685. +
  686. +MAGIC_NONE = 0x000000 # No flags
  687. +MAGIC_DEBUG = 0x000001 # Turn on debugging
  688. +MAGIC_SYMLINK = 0x000002 # Follow symlinks
  689. +MAGIC_COMPRESS = 0x000004 # Check inside compressed files
  690. +MAGIC_DEVICES = 0x000008 # Look at the contents of devices
  691. +MAGIC_MIME = 0x000010 # Return a mime string
  692. +MAGIC_MIME_ENCODING = 0x000400 # Return the MIME encoding
  693. +MAGIC_CONTINUE = 0x000020 # Return all matches
  694. +MAGIC_CHECK = 0x000040 # Print warnings to stderr
  695. +MAGIC_PRESERVE_ATIME = 0x000080 # Restore access time on exit
  696. +MAGIC_RAW = 0x000100 # Don't translate unprintable chars
  697. +MAGIC_ERROR = 0x000200 # Handle ENOENT etc as real errors
  698. +
  699. +MAGIC_NO_CHECK_COMPRESS = 0x001000 # Don't check for compressed files
  700. +MAGIC_NO_CHECK_TAR = 0x002000 # Don't check for tar files
  701. +MAGIC_NO_CHECK_SOFT = 0x004000 # Don't check magic entries
  702. +MAGIC_NO_CHECK_APPTYPE = 0x008000 # Don't check application type
  703. +MAGIC_NO_CHECK_ELF = 0x010000 # Don't check for elf details
  704. +MAGIC_NO_CHECK_ASCII = 0x020000 # Don't check for ascii files
  705. +MAGIC_NO_CHECK_TROFF = 0x040000 # Don't check ascii/troff
  706. +MAGIC_NO_CHECK_FORTRAN = 0x080000 # Don't check ascii/fortran
  707. +MAGIC_NO_CHECK_TOKENS = 0x100000 # Don't check ascii/tokens
  708. +
  709. +# This package name conflicts with the one provided by upstream
  710. +# libmagic. This is a common source of confusion for users. To
  711. +# resolve, we ship a copy of that module and expose its functions
  712. +# wrapped in deprecation warnings.
  713. +def add_compat(to_module):
  714. +
  715. + import warnings, re
  716. + from magic import compat
  717. +
  718. + def deprecation_wrapper(compat, fn, alternate):
  719. + def _(*args, **kwargs):
  720. + warnings.warn(
  721. + "Using compatibility mode with libmagic's python binding",
  722. + DeprecationWarning)
  723. +
  724. + return compat[fn](*args, **kwargs)
  725. + return _
  726. +
  727. + fn = [('detect_from_filename', 'magic.from_file'),
  728. + ('detect_from_content', 'magic.from_buffer'),
  729. + ('detect_from_fobj', 'magic.Magic.from_open_file'),
  730. + ('open', 'magic.Magic')]
  731. + for (fname, alternate) in fn:
  732. + # for now, disable the deprecation warning until there's clarity on
  733. + # what the merged module should look like
  734. + to_module[fname] = compat.__dict__.get(fname)
  735. + #to_module[fname] = deprecation_wrapper(compat.__dict__, fname, alternate)
  736. +
  737. + # copy constants over, ensuring there are no conflicts
  738. + is_const_re = re.compile("^[A-Z_]+$")
  739. + allowed_inconsistent = set(['MAGIC_MIME'])
  740. + for name, value in compat.__dict__.items():
  741. + if is_const_re.match(name):
  742. + if name in to_module:
  743. + if name in allowed_inconsistent:
  744. + continue
  745. + if to_module[name] != value:
  746. + raise Exception("inconsistent value for " + name)
  747. + else:
  748. + continue
  749. + else:
  750. + to_module[name] = value
  751. +
  752. +add_compat(globals())
  753. --- /dev/null
  754. +++ b/magic/compat.py
  755. @@ -0,0 +1,285 @@
  756. +# coding: utf-8
  757. +
  758. +'''
  759. +Python bindings for libmagic
  760. +'''
  761. +
  762. +import ctypes
  763. +
  764. +from collections import namedtuple
  765. +
  766. +from ctypes import *
  767. +from ctypes.util import find_library
  768. +
  769. +
  770. +def _init():
  771. + """
  772. + Loads the shared library through ctypes and returns a library
  773. + L{ctypes.CDLL} instance
  774. + """
  775. + return ctypes.cdll.LoadLibrary(find_library('magic'))
  776. +
  777. +_libraries = {}
  778. +_libraries['magic'] = _init()
  779. +
  780. +# Flag constants for open and setflags
  781. +MAGIC_NONE = NONE = 0
  782. +MAGIC_DEBUG = DEBUG = 1
  783. +MAGIC_SYMLINK = SYMLINK = 2
  784. +MAGIC_COMPRESS = COMPRESS = 4
  785. +MAGIC_DEVICES = DEVICES = 8
  786. +MAGIC_MIME_TYPE = MIME_TYPE = 16
  787. +MAGIC_CONTINUE = CONTINUE = 32
  788. +MAGIC_CHECK = CHECK = 64
  789. +MAGIC_PRESERVE_ATIME = PRESERVE_ATIME = 128
  790. +MAGIC_RAW = RAW = 256
  791. +MAGIC_ERROR = ERROR = 512
  792. +MAGIC_MIME_ENCODING = MIME_ENCODING = 1024
  793. +MAGIC_MIME = MIME = 1040 # MIME_TYPE + MIME_ENCODING
  794. +MAGIC_APPLE = APPLE = 2048
  795. +
  796. +MAGIC_NO_CHECK_COMPRESS = NO_CHECK_COMPRESS = 4096
  797. +MAGIC_NO_CHECK_TAR = NO_CHECK_TAR = 8192
  798. +MAGIC_NO_CHECK_SOFT = NO_CHECK_SOFT = 16384
  799. +MAGIC_NO_CHECK_APPTYPE = NO_CHECK_APPTYPE = 32768
  800. +MAGIC_NO_CHECK_ELF = NO_CHECK_ELF = 65536
  801. +MAGIC_NO_CHECK_TEXT = NO_CHECK_TEXT = 131072
  802. +MAGIC_NO_CHECK_CDF = NO_CHECK_CDF = 262144
  803. +MAGIC_NO_CHECK_TOKENS = NO_CHECK_TOKENS = 1048576
  804. +MAGIC_NO_CHECK_ENCODING = NO_CHECK_ENCODING = 2097152
  805. +
  806. +MAGIC_NO_CHECK_BUILTIN = NO_CHECK_BUILTIN = 4173824
  807. +
  808. +FileMagic = namedtuple('FileMagic', ('mime_type', 'encoding', 'name'))
  809. +
  810. +
  811. +class magic_set(Structure):
  812. + pass
  813. +magic_set._fields_ = []
  814. +magic_t = POINTER(magic_set)
  815. +
  816. +_open = _libraries['magic'].magic_open
  817. +_open.restype = magic_t
  818. +_open.argtypes = [c_int]
  819. +
  820. +_close = _libraries['magic'].magic_close
  821. +_close.restype = None
  822. +_close.argtypes = [magic_t]
  823. +
  824. +_file = _libraries['magic'].magic_file
  825. +_file.restype = c_char_p
  826. +_file.argtypes = [magic_t, c_char_p]
  827. +
  828. +_descriptor = _libraries['magic'].magic_descriptor
  829. +_descriptor.restype = c_char_p
  830. +_descriptor.argtypes = [magic_t, c_int]
  831. +
  832. +_buffer = _libraries['magic'].magic_buffer
  833. +_buffer.restype = c_char_p
  834. +_buffer.argtypes = [magic_t, c_void_p, c_size_t]
  835. +
  836. +_error = _libraries['magic'].magic_error
  837. +_error.restype = c_char_p
  838. +_error.argtypes = [magic_t]
  839. +
  840. +_setflags = _libraries['magic'].magic_setflags
  841. +_setflags.restype = c_int
  842. +_setflags.argtypes = [magic_t, c_int]
  843. +
  844. +_load = _libraries['magic'].magic_load
  845. +_load.restype = c_int
  846. +_load.argtypes = [magic_t, c_char_p]
  847. +
  848. +_compile = _libraries['magic'].magic_compile
  849. +_compile.restype = c_int
  850. +_compile.argtypes = [magic_t, c_char_p]
  851. +
  852. +_check = _libraries['magic'].magic_check
  853. +_check.restype = c_int
  854. +_check.argtypes = [magic_t, c_char_p]
  855. +
  856. +_list = _libraries['magic'].magic_list
  857. +_list.restype = c_int
  858. +_list.argtypes = [magic_t, c_char_p]
  859. +
  860. +_errno = _libraries['magic'].magic_errno
  861. +_errno.restype = c_int
  862. +_errno.argtypes = [magic_t]
  863. +
  864. +
  865. +class Magic(object):
  866. + def __init__(self, ms):
  867. + self._magic_t = ms
  868. +
  869. + def close(self):
  870. + """
  871. + Closes the magic database and deallocates any resources used.
  872. + """
  873. + _close(self._magic_t)
  874. +
  875. + @staticmethod
  876. + def __tostr(s):
  877. + if s is None:
  878. + return None
  879. + if isinstance(s, str):
  880. + return s
  881. + try: # keep Python 2 compatibility
  882. + return str(s, 'utf-8')
  883. + except TypeError:
  884. + return str(s)
  885. +
  886. + @staticmethod
  887. + def __tobytes(b):
  888. + if b is None:
  889. + return None
  890. + if isinstance(b, bytes):
  891. + return b
  892. + try: # keep Python 2 compatibility
  893. + return bytes(b, 'utf-8')
  894. + except TypeError:
  895. + return bytes(b)
  896. +
  897. + def file(self, filename):
  898. + """
  899. + Returns a textual description of the contents of the argument passed
  900. + as a filename or None if an error occurred and the MAGIC_ERROR flag
  901. + is set. A call to errno() will return the numeric error code.
  902. + """
  903. + return Magic.__tostr(_file(self._magic_t, Magic.__tobytes(filename)))
  904. +
  905. + def descriptor(self, fd):
  906. + """
  907. + Returns a textual description of the contents of the argument passed
  908. + as a file descriptor or None if an error occurred and the MAGIC_ERROR
  909. + flag is set. A call to errno() will return the numeric error code.
  910. + """
  911. + return Magic.__tostr(_descriptor(self._magic_t, fd))
  912. +
  913. + def buffer(self, buf):
  914. + """
  915. + Returns a textual description of the contents of the argument passed
  916. + as a buffer or None if an error occurred and the MAGIC_ERROR flag
  917. + is set. A call to errno() will return the numeric error code.
  918. + """
  919. + return Magic.__tostr(_buffer(self._magic_t, buf, len(buf)))
  920. +
  921. + def error(self):
  922. + """
  923. + Returns a textual explanation of the last error or None
  924. + if there was no error.
  925. + """
  926. + return Magic.__tostr(_error(self._magic_t))
  927. +
  928. + def setflags(self, flags):
  929. + """
  930. + Set flags on the magic object which determine how magic checking
  931. + behaves; a bitwise OR of the flags described in libmagic(3), but
  932. + without the MAGIC_ prefix.
  933. +
  934. + Returns -1 on systems that don't support utime(2) or utimes(2)
  935. + when PRESERVE_ATIME is set.
  936. + """
  937. + return _setflags(self._magic_t, flags)
  938. +
  939. + def load(self, filename=None):
  940. + """
  941. + Must be called before any magic queries can be performed. Loads entries
  942. + from the colon separated list of database files passed as argument, or
  943. + from the default database file if no argument is given.
  944. +
  945. + Returns 0 on success and -1 on failure.
  946. + """
  947. + return _load(self._magic_t, Magic.__tobytes(filename))
  948. +
  949. + def compile(self, dbs):
  950. + """
  951. + Compile entries in the colon separated list of database files
  952. + passed as argument or the default database file if no argument.
  953. + The compiled files created are named from the basename(1) of each file
  954. + argument with ".mgc" appended to it.
  955. +
  956. + Returns 0 on success and -1 on failure.
  957. + """
  958. + return _compile(self._magic_t, Magic.__tobytes(dbs))
  959. +
  960. + def check(self, dbs):
  961. + """
  962. + Check the validity of entries in the colon separated list of
  963. + database files passed as argument or the default database file
  964. + if no argument.
  965. +
  966. + Returns 0 on success and -1 on failure.
  967. + """
  968. + return _check(self._magic_t, Magic.__tobytes(dbs))
  969. +
  970. + def list(self, dbs):
  971. + """
  972. + Dump all magic entries in a human readable format from the colon
  973. + separated list of database files passed as argument or the default
  974. + database file if no argument.
  975. +
  976. + Returns 0 on success and -1 on failure.
  977. + """
  978. + return _list(self._magic_t, Magic.__tobytes(dbs))
  979. +
  980. + def errno(self):
  981. + """
  982. + Returns a numeric error code. If return value is 0, an internal
  983. + magic error occurred. If return value is non-zero, the value is
  984. + an OS error code. The errno module or os.strerror() can be used
  985. + to provide detailed error information.
  986. + """
  987. + return _errno(self._magic_t)
  988. +
  989. +
  990. +def open(flags):
  991. + """
  992. + Returns a Magic object wrapping the handle returned by magic_open();
  993. + the handle is a NULL pointer on failure. Flags argument as for setflags.
  994. + """
  995. + return Magic(_open(flags))
  996. +
  997. +
  998. +# Objects used by `detect_from_` functions
  999. +mime_magic = Magic(_open(MAGIC_MIME))
  1000. +mime_magic.load()
  1001. +none_magic = Magic(_open(MAGIC_NONE))
  1002. +none_magic.load()
  1003. +
  1004. +
  1005. +def _create_filemagic(mime_detected, type_detected):
  1006. + mime_type, mime_encoding = mime_detected.split('; ')
  1007. +
  1008. + return FileMagic(name=type_detected, mime_type=mime_type,
  1009. + encoding=mime_encoding.replace('charset=', ''))
  1010. +
  1011. +
  1012. +def detect_from_filename(filename):
  1013. + '''Detect mime type, encoding and file type from a filename
  1014. +
  1015. + Returns a `FileMagic` namedtuple.
  1016. + '''
  1017. +
  1018. + return _create_filemagic(mime_magic.file(filename),
  1019. + none_magic.file(filename))
  1020. +
  1021. +
  1022. +def detect_from_fobj(fobj):
  1023. + '''Detect mime type, encoding and file type from file-like object
  1024. +
  1025. + Returns a `FileMagic` namedtuple.
  1026. + '''
  1027. +
  1028. + file_descriptor = fobj.fileno()
  1029. + return _create_filemagic(mime_magic.descriptor(file_descriptor),
  1030. + none_magic.descriptor(file_descriptor))
  1031. +
  1032. +
  1033. +def detect_from_content(byte_content):
  1034. + '''Detect mime type, encoding and file type from bytes
  1035. +
  1036. + Returns a `FileMagic` namedtuple.
  1037. + '''
  1038. +
  1039. + return _create_filemagic(mime_magic.buffer(byte_content),
  1040. + none_magic.buffer(byte_content))
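Note on the hunk above: `_create_filemagic` splits the MIME output on `'; '` and strips the `charset=` prefix, and `MAGIC_MIME` is documented as the combination of `MIME_TYPE` and `MIME_ENCODING`. The following is a minimal standalone sketch of both points that does not need libmagic. `parse_mime_output` is a hypothetical name, not part of the patch, and the sample strings are assumed inputs modelled on the expected values in test/libmagic_test.py. It uses `str.partition` instead of the patch's `split`, so a string without a `; charset=` part yields an empty encoding rather than raising ValueError.

```python
from collections import namedtuple

FileMagic = namedtuple('FileMagic', ('mime_type', 'encoding', 'name'))

# Values copied from the flag constants in magic/compat.py.
MAGIC_MIME_TYPE = 16
MAGIC_MIME_ENCODING = 1024
# Flags are combined with a bitwise OR; this equals the literal 1040 in the patch.
MAGIC_MIME = MAGIC_MIME_TYPE | MAGIC_MIME_ENCODING


def parse_mime_output(mime_detected, type_detected):
    """Hypothetical stand-in for _create_filemagic (illustration only)."""
    # partition() returns (head, separator, tail); tail is '' when the
    # separator is absent, so no unpacking error can occur here.
    mime_type, _, params = mime_detected.partition('; ')
    encoding = params.replace('charset=', '')
    return FileMagic(mime_type=mime_type, encoding=encoding,
                     name=type_detected)


# Assumed sample inputs, shaped like the values the test case expects.
result = parse_mime_output('application/pdf; charset=us-ascii',
                           'PDF document, version 1.2')
no_charset = parse_mime_output('application/pdf', 'PDF document')
print(result)
```

Whether a lenient fallback is preferable to the patch's strict unpacking depends on the caller. The strict version surfaces unexpected libmagic output immediately, while this variant hides it behind an empty `encoding` field.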
  1041. --- a/setup.py
  1042. +++ b/setup.py
  1043. @@ -8,8 +8,8 @@
  1044. author='Adam Hupp',
  1045. author_email='adam@hupp.org',
  1046. url="http://github.com/ahupp/python-magic",
  1047. - version='0.4.15',
  1048. - py_modules=['magic'],
  1049. + version='0.4.16',
  1050. + packages=['magic'],
  1051. long_description="""This module uses ctypes to access the libmagic file type
  1052. identification library. It makes use of the local magic database and
  1053. supports both textual and MIME-type output.
  1054. --- /dev/null
  1055. +++ b/test/libmagic_test.py
  1056. @@ -0,0 +1,39 @@
  1057. +# coding: utf-8
  1058. +
  1059. +import unittest
  1060. +
  1061. +import magic
  1062. +
  1063. +
  1064. +class MagicTestCase(unittest.TestCase):
  1065. +
  1066. + filename = 'test/testdata/test.pdf'
  1067. + expected_mime_type = 'application/pdf'
  1068. + expected_encoding = 'us-ascii'
  1069. + expected_name = 'PDF document, version 1.2'
  1070. +
  1071. + def assert_result(self, result):
  1072. + self.assertEqual(result.mime_type, self.expected_mime_type)
  1073. + self.assertEqual(result.encoding, self.expected_encoding)
  1074. + self.assertEqual(result.name, self.expected_name)
  1075. +
  1076. + def test_detect_from_filename(self):
  1077. + result = magic.detect_from_filename(self.filename)
  1078. + self.assert_result(result)
  1079. +
  1080. + def test_detect_from_fobj(self):
  1081. + with open(self.filename) as fobj:
  1082. + result = magic.detect_from_fobj(fobj)
  1083. + self.assert_result(result)
  1084. +
  1085. + def test_detect_from_content(self):
  1086. + # differs from upstream by opening the file in binary mode;
  1087. + # this avoids hitting a bug in the python3+libfile bindings,
  1088. + # see https://github.com/ahupp/python-magic/issues/152
  1089. + # for a similar issue
  1090. + with open(self.filename, 'rb') as fobj:
  1091. + result = magic.detect_from_content(fobj.read(4096))
  1092. + self.assert_result(result)
  1093. +
  1094. +if __name__ == '__main__':
  1095. + unittest.main()
  1096. --- a/test/run.sh
  1097. +++ b/test/run.sh
  1098. @@ -8,7 +8,10 @@
  1099. echo "python2.6"
  1100. python2.6 ${THISDIR}/test.py
  1101. +python2.6 ${THISDIR}/libmagic_test.py
  1102. echo "python2.7"
  1103. python2.7 ${THISDIR}/test.py
  1104. -echo "python3.0"
  1105. +python2.7 ${THISDIR}/libmagic_test.py
  1106. +echo "python3"
  1107. python3 ${THISDIR}/test.py
  1108. +python3 ${THISDIR}/libmagic_test.py
  1109. --- a/test/test.py
  1110. +++ b/test/test.py
  1111. @@ -37,7 +37,13 @@
  1112. self.assertEqual("text/x-python", m.from_buffer(s))
  1113. b = b'#!/usr/bin/env python\nprint("foo")'
  1114. self.assertEqual("text/x-python", m.from_buffer(b))
  1115. -
  1116. +
  1117. +
  1118. + def test_open_file(self):
  1119. + m = magic.Magic(mime=True)
  1120. + with open(os.path.join(self.TESTDATA_DIR, "test.pdf")) as f:
  1121. + self.assertEqual("application/pdf", m.from_open_file(f))
  1122. +
  1123. def test_mime_types(self):
  1124. dest = os.path.join(MagicTest.TESTDATA_DIR, b'\xce\xbb'.decode('utf-8'))
  1125. shutil.copyfile(os.path.join(MagicTest.TESTDATA_DIR, 'lambda'), dest)