Add some new preview systems for common file types: #91

Open
victordomingos opened this issue Sep 18, 2018 · 6 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@victordomingos
Owner

In the standard library, there are tools for working with common file types. Text files can be read, but what kind of data can be used to preview images or PDF files?

Originally posted by @NataliaBondarenko in #84 (comment)

@victordomingos
Owner Author

The standard library does indeed have a few tools that we may consider using for this purpose, without having to dive into lower-level APIs or third-party packages:

Databases:

  • sqlite3

Text based data files:

  • csv
  • json

Compressed archives:

  • zipfile
  • gzip
  • tarfile
  • zlib
  • bz2
  • lzma

Audio:

  • aifc - Read and write AIFF and AIFC files
  • wave - Read and write WAV files
  • sndhdr - Determine type of sound file

Images:

  • imghdr - Determine the type of an image

Maybe this one can be useful also:

  • mimetypes: Map filenames to MIME types
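As a rough illustration of how a couple of these modules could feed a text-mode preview, here is a minimal sketch. The function name `preview_lines` and its `limit` parameter are hypothetical and not part of the project; only ZIP archives and CSV files are handled, and everything else falls back to the type guessed by `mimetypes`.

```python
import csv
import mimetypes
import zipfile


def preview_lines(path, limit=5):
    """Return up to `limit` short text lines describing the file (hypothetical helper)."""
    if zipfile.is_zipfile(path):
        # List the first few archive members with their uncompressed sizes.
        with zipfile.ZipFile(path) as archive:
            return [f"{info.filename} ({info.file_size} bytes)"
                    for info in archive.infolist()[:limit]]
    if path.lower().endswith(".csv"):
        # Show the first few rows, with fields separated by " | ".
        with open(path, newline="", encoding="utf-8") as f:
            return [" | ".join(row) for _, row in zip(range(limit), csv.reader(f))]
    # No dedicated preview: report only what mimetypes guesses from the name.
    mime, _ = mimetypes.guess_type(path)
    return [f"no text preview available (guessed type: {mime})"]
```

The other modules in the list (`json`, `tarfile`, `wave`, `sqlite3`) could be added as further branches in the same way.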

For other file types, like PDF, XLSX, DOCX, and images, we can use third-party packages such as PIL. In my opinion, these should not be required as runtime dependencies: the application should run with or without them, but when they are present it should try to use them to present more meaningful preview info.

Or, in some cases, we may try to write custom functions to read and interpret the binary data, in cases where the file header is public and well documented, but I think it tends to be a more troublesome road...
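One common way to keep such a package optional is to guard the import and fall back to a basic message. The sketch below assumes Pillow (imported as `PIL`) as the optional package. The function name `image_preview` and the fallback message are made up for illustration.

```python
try:
    from PIL import Image  # optional third-party dependency (Pillow)
except ImportError:
    Image = None  # the application keeps working without it


def image_preview(path):
    """Describe an image if Pillow is installed (hypothetical helper)."""
    if Image is None:
        return "image preview unavailable (Pillow not installed)"
    with Image.open(path) as img:
        return f"{img.format} image, {img.width}x{img.height}, mode {img.mode}"
```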

@NataliaBondarenko
Collaborator

Here is one crazy idea for previewing the contents of a media file without additional dependencies.
Some formats are supported by browsers, so we can open a path in the browser:

[Screenshot: preview]

In this case an HTML document is generated, and by clicking a link you can view the contents of images and some text formats.

[Screenshots: preview_png, preview_text]

@NataliaBondarenko
Collaborator

We could also generate a report as an HTML document containing a more compact, sorted list of local links to the pictures.
This kind of preview cannot be shown in the terminal.
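A minimal sketch of such a report, using only the standard library, might look like this. The function name `build_report` and the HTML layout are assumptions for illustration, not existing project code.

```python
import html
from pathlib import Path


def build_report(paths):
    """Return an HTML page with a sorted list of file:// links (hypothetical helper)."""
    items = []
    for p in sorted(Path(p).resolve() for p in paths):
        # as_uri() needs an absolute path, hence the resolve() above.
        items.append(f'<li><a href="{p.as_uri()}">{html.escape(p.name)}</a></li>')
    return ("<!DOCTYPE html>\n<html><body><ul>\n"
            + "\n".join(items)
            + "\n</ul></body></html>\n")
```

The resulting string could be written to a temporary `.html` file and opened with `webbrowser.open()`.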

@victordomingos
Owner Author

Count-files is primarily a text-mode, console-based application. The main functionality should be available through the console itself. But it is an interesting feature. I think it could be added as a new option (--use-browser, or something similar) providing a more detailed preview mode, but there should be a text-mode version first for each format, even if the information displayed is very basic.
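To make the idea concrete, an opt-in flag along these lines could be wired up with `argparse`. This is only a sketch: the parser below is a stand-alone demo, not the project's actual command-line interface, and `build_parser`, the `prog` name and the help text are hypothetical.

```python
import argparse


def build_parser():
    """Build a demo parser with an opt-in browser preview flag (hypothetical)."""
    parser = argparse.ArgumentParser(prog="preview-demo")
    parser.add_argument("path", help="file or folder to preview")
    parser.add_argument(
        "--use-browser",
        action="store_true",
        help="open a more detailed preview in the browser (text mode is the default)",
    )
    return parser
```

With `store_true`, `args.use_browser` stays `False` unless the flag is given, so the text-mode preview remains the default behaviour.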

@victordomingos victordomingos added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels Sep 5, 2019
@NataliaBondarenko
Collaborator

Preview for binary files:
We can use the file signature to preview binary files: show the signature itself and/or use it to detect known file types.

>>> with open('path/to/file.png', 'rb') as f:
...     header = f.read(20)  # the first 20 bytes are enough for most signatures
...
>>> header
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x06@'
>>> header.startswith(bytes.fromhex("89 50 4E 47 0D 0A 1A 0A"))  # PNG signature
True
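Extending this to several known types could look roughly like the sketch below. The `SIGNATURES` table and the helper names are hypothetical, and the table covers only a handful of well-known magic numbers. A lookup like this may also be worth having because `imghdr` and `sndhdr` were removed from the standard library in Python 3.13.

```python
# Well-known leading byte sequences ("magic numbers") and a label for each.
SIGNATURES = [
    (bytes.fromhex("89504E470D0A1A0A"), "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also the container for DOCX and XLSX)"),
    (b"\x1f\x8b", "gzip data"),
]


def identify(header):
    """Return a label for the first matching signature, or None."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return None


def signature_preview(path, size=20):
    """Return one line with the detected type and the leading bytes in hex."""
    with open(path, "rb") as f:
        header = f.read(size)
    label = identify(header) or "unknown type"
    return f"{label}: {header.hex()}"
```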

What do you think about this?

@victordomingos
Owner Author

It's a good idea. 👍


2 participants