I've searched around but can't seem to find any information on the topic.
How do you set the MIME type for an existing file?
For example, I'm trying to create a file with type text/html.
1 Answer
MIME types are not actually stored on the filesystem. They are merely a convenient way of knowing how to process a file. To get the MIME type, you have to run a program.
Some programs detect the MIME type of a file solely by looking at the file extension, while others check the file contents for a magic number or a special magic pattern (essentially a regex).
As an example, run touch test.html, which creates an empty file.
Then run xdg-mime query filetype test.html or mimetype test.html. Both of them will return the type text/html.
However, if you run file --mime-type -b test.html, it will return inode/x-empty.
So, if you want all programs to act the same way on your file, the file should have the proper format (data) along with the correct extension.
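To make the difference concrete, here is a minimal Python sketch of the two approaches. It is only an illustration: guess_by_extension and guess_by_content are hypothetical helper names, and the content check is a toy stand-in for real magic tests, not what file or xdg-mime actually run.

```python
import mimetypes
import os
import tempfile


def guess_by_extension(path):
    """Guess the MIME type from the file name alone; the file is never opened."""
    mime, _encoding = mimetypes.guess_type(path)
    return mime


def guess_by_content(path):
    """Toy content sniffer (an assumption for illustration, far simpler than file)."""
    with open(path, "rb") as f:
        head = f.read(64)
    if not head:
        # Nothing to inspect, so the extension cannot help here.
        return "inode/x-empty"
    if head.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "text/html"
    return "application/octet-stream"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.html")
    open(path, "wb").close()          # same effect as: touch test.html
    print(guess_by_extension(path))   # text/html
    print(guess_by_content(path))     # inode/x-empty
```

Once the file contains real HTML, both functions agree, which is the point of giving a file both the right data and the right extension.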
What is a magic number?
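A magic number is a fixed sequence of bytes at a known position in a file, usually the very beginning, that identifies the file's format.
Running xxd image | head -1 on my profile image produces the following output:
00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR
According to Wikipedia, 89 50 4E 47 0D 0A 1A 0A is the standard header for all image/png files.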
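Checking for that header yourself takes only a few lines. The sketch below compares the first eight bytes against the PNG signature quoted above; the function names are made up for this example.

```python
# The eight-byte PNG signature: 89 50 4E 47 0D 0A 1A 0A
PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")


def has_png_signature(data):
    """Return True if the byte string starts with the PNG magic number."""
    return data.startswith(PNG_SIGNATURE)


def file_is_png(path):
    """Read only as many bytes as the signature needs and compare them."""
    with open(path, "rb") as f:
        return has_png_signature(f.read(len(PNG_SIGNATURE)))


# The first 16 bytes from the xxd output above.
first_line = bytes.fromhex("89504e470d0a1a0a0000000d49484452")
print(has_png_signature(first_line))        # True
print(has_png_signature(b"<html></html>"))  # False
```

Note that the file name plays no part in this check: a PNG renamed to image.txt still matches.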
How does the file command work?
From the file(1) man page:
There are three sets of tests, performed in this order: filesystem tests, magic tests, and language tests. ...
The filesystem tests are based on examining the return from a stat(2) system call. The program checks to see if the file is empty, or if it's some sort of special file. ...
The magic tests are used to check for files with data in particular fixed formats. ... These files have a 'magic number' stored in a particular place near the beginning of the file that tells the UNIX operating system that the file is a binary executable, and which of several types thereof. ... If a file does not match any of the entries in the magic file, it is examined to see if it seems to be a text file. ...
Any file that cannot be identified as having been written in any of the character sets listed...is simply said to be 'data'.
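The order of those tests can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the actual implementation of file: MAGIC_TABLE is a tiny hypothetical table, the returned labels are made up, and the text test only checks for NUL bytes and valid UTF-8.

```python
import codecs
import os
import stat
import tempfile

# Hypothetical three-entry magic table: (offset, bytes to match, label).
MAGIC_TABLE = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (0, b"%PDF-", "PDF document"),
    (0, b"\x7fELF", "ELF"),
]


def classify(path):
    # 1. Filesystem tests: look only at the stat result.
    st = os.lstat(path)
    if stat.S_ISDIR(st.st_mode):
        return "directory"
    if not stat.S_ISREG(st.st_mode):
        return "special file"
    if st.st_size == 0:
        return "empty"

    with open(path, "rb") as f:
        head = f.read(4096)

    # 2. Magic tests: compare fixed bytes at fixed offsets.
    for offset, pattern, label in MAGIC_TABLE:
        if head[offset:offset + len(pattern)] == pattern:
            return label

    # 3. Text test: no NUL bytes and decodable as UTF-8 counts as text.
    if b"\x00" in head:
        return "data"
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(head)  # tolerates a character cut off at the end
    except UnicodeDecodeError:
        return "data"
    return "text"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.html")
    open(path, "wb").close()
    print(classify(path))  # empty
```

The empty-file case is decided in step 1, before any content is read, which matches the inode/x-empty result shown earlier for test.html.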
How does file know what magic patterns to use?
Again, from the file(1) man page:
The information identifying these files is read from the compiled magic file /usr/share/misc/magic.mgc, or the files in the directory /usr/share/misc/magic if the compiled file does not exist. In addition, if $HOME/.magic.mgc or $HOME/.magic exists, it will be used in preference to the system magic files. If /etc/magic exists, it will be used together with other magic files.
Running strace file image |& grep magic shows that the file command looks for these files:
/usr/lib/x86_64-linux-gnu/libmagic.so.1 => libmagic(3)
~/.magic.mgc
~/.magic
/etc/magic.mgc
/etc/magic
/usr/share/misc/magic.mgc
There are other files, such as /etc/mime.types, which other programs use. This file maps MIME types to file extensions. For example, grep -i text/html /etc/mime.types produces:
text/html html htm shtml
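A program that relies on this file essentially builds an extension lookup table from it. Here is a minimal sketch of that idea, assuming the usual layout of one MIME type followed by its extensions per line, with # starting a comment. SAMPLE is made-up content in that layout (only the text/html line is taken from the output above), and parse_mime_types is a hypothetical name.

```python
# Sample content in the mime.types layout; not a copy of any real file.
SAMPLE = """\
# MIME type                 extensions
text/html                   html htm shtml
image/png                   png
application/octet-stream
"""


def parse_mime_types(text):
    """Build an extension -> MIME type dictionary from mime.types-style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime, *extensions = line.split()
        for ext in extensions:                # a type may list no extensions
            mapping[ext] = mime
    return mapping


table = parse_mime_types(SAMPLE)
print(table["shtml"])  # text/html
print(table["png"])    # image/png
```

To try it against the real file, pass the contents of /etc/mime.types instead of SAMPLE, if that file exists on your system.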