Skip to content

Latest commit

 

History

History
executable file
·
331 lines (176 loc) · 7.2 KB

01_file_naming.md

File metadata and controls

executable file
·
331 lines (176 loc) · 7.2 KB
layout title nav_order
default
File Naming
4

Why is file naming important?

{: .no_toc }

Creating a well-organized hierarchy of files with clear naming conventions is an important part of improving your research process. This is especially important if you are working with large data sets and complex output files or coordinating with multiple people at multiple institutions. There are many ways to structure your folders, and multiple naming conventions you can use. The key is consistency. Make your file names descriptive, and include information about dates and versioning. The best practice is to consult with your lab or with your co-workers to develop a naming schema that everyone is willing to follow consistently.

Looking for a cheat sheet? Check out our one-pager! {: .note}

Table of contents {: .text-delta } - TOC {:toc}

What do you think about the following file names?

  • 10_data 2.txt
  • figure 1.png
  • final revision.docx
  • Lily's schedule&plan 2022Jul9.xlsx

This is what happens when you do not have effective naming conventions:

Are these names better?

  • better-filenames.txt
  • 003_raw-data_2022-07-09.txt
  • fig01_scatterplot-talk-length-vs-interest.png
  • 20220709_interview-script_v01.docx

Follow three principles!  

{: .no_toc }

1{: .circle .circle-blue}  Machine-Readable

2{: .circle .circle-red}  Human-Readable

3{: .circle .circle-yellow}  Plays Well With Default Ordering


1{: .circle .circle-blue}   Machine-Readable

Goal

{: .no_toc }

  • Characters in file names are handled correctly by all computer systems
  • Be consistent with the chosen naming convention

Only use the following:

{: .no_toc }

  • Alphanumeric characters (alphabetic characters and Arabic numerals)

  • Use _(underscore) or -(hyphen)to separate words/numbers (snake case).

  • this_is_snake_case

  • Use capitalization to separate words/numbers (camel case).

  • thisIsCamelCase

  • Avoid spaces and other special characters, such as: ~ ! @ # $ % ^ & * ( ) ` ; : < > ? . , [ ] { } ' " |

  • Certain special characters are used by OS to perform tasks.

  • Can be a pain to read in files with special characters in the file name.

Be Consistent:

{: .no_toc }

  • The way you name your file should be consistent
  • The style you choose should be based on conventions adopted in a given project, organization, language, etc.
  • For example, R and Python use snake case, so file names and folder names should follow this convention.

Exercise 1

{: .no_toc }

Let's try to improve the file names! Pick your favourite file name and make it more machine-readable!



2{: .circle .circle-red}   Human-Readable

  • File names provide concise information -Names are easily understandable to anyone who accesses them in future (including future you)
  • File names should be consistent
  • File names should be concise but detailed
  • Names should be detailed but not too long
  • How much (or little) detail is a subjective decision
# not good
a.txt


# okay but can be a more detailed
application.txt


# good, just enough detail
ubc_application_letter.txt


# not amazing, a bit too much detail
ubc_application_letter_for_institution_position_firstname_lastname_final_date.txt
  • Avoid application (software) details in names
  • File names do not need details (e.g., type of file). This makes the name too long. As well it is also unnecessary.
# too much information
clean_data_py_script.py


# good
clean_data.py


3{: .circle .circle-yellow}   Plays Well With Default Ordering

Goal

{: .no_toc }

  • Decide at the beginning how you want to sort and search for your files:
    • Chronological order
    • Logical order

Chronological order

{: .no_toc }

  • Use ISO 8601 standard for dates: YYYYMMDD or YYYY-MM-DD

    • Using the ISO 8601 standard ensures that dates are consistently formatted and correctly interpreted. For example, 10-12-2024 will be interpreted as DD-MM-YYYY in Europe but MM-DD-YYYY in North America.
# automatically generated monthly reports for different subjects


2024-01-01_subject_1_results.xlsx
2024-01-01_subject_2_results.xlsx
2024-01-01_subject_3_results.xlsx
2024-02-01_subject_1_results.xlsx
2024-02-01_subject_2_results.xlsx
2024-02-01_subject_3_results.xlsx

Logical order

{: .no_toc }

  • When using a sequential numbering system, use leading zeros to make sure files sort in sequential order. e.g. 001, 002, 010, 011....100,101 ...
  • Order elements from general to specific to make searching easier
00_innit.R
01_read_data.R
02_clean_data.R
03_model.R
04_visualize.R
helper01_load_variables.R
helper02_functions.R


Exercise 2

{: .no_toc }

Your lab has a spectrometer that is measuring thermal emissions once a day for a year for your experiment. There are three people who take that measurement in the lab.

Please create a file naming convention for these files to reflect what you have learned about file naming in today's session.


Congrats!

{: .no_toc }

Now you know how to organize files with your own file naming conventions! As long as your names are clear and consistent, you are good to move forward!



Sources

{: .no_toc }


Need help? {: .label .label-blue } Please reach out to [email protected] for assistance with any of your research data questions.