# 8 - File Storage & Organization

## Metadata

- **Channel:** Arielle Miller
- **YouTube:** https://www.youtube.com/watch?v=QmIsWnTqKAE

## Contents

### [0:00](https://www.youtube.com/watch?v=QmIsWnTqKAE) Segment 1 (00:00 - 05:00)

In this series of modules, we will explore the methods, rationale, and good practices of documenting your research. Proper documentation is crucial for organizing your thoughts, preserving knowledge, and sharing your findings with the scientific community.

By the end of this series of lessons, you will first understand the different types of lab notebooks available. Each type has its unique features and purposes, ranging from physical notebooks to electronic platforms. Second, you will be able to determine which one best suits your needs. We will explore the factors to consider when selecting the right lab notebook for your research requirements. Third, you will recognize the significance of clear and thorough documentation, which not only aids in organizing your research but also allows for reproducibility, credibility, and verification of your scientific endeavors. Fourth, you will understand what information to document, which is crucial for capturing the essence of your research. We will explore the essential elements that should be included in your lab notebook and in your file storage. Fifth, you will learn essential guidelines for effective documentation. These guidelines will help you establish consistency, clarity, and professionalism in your scientific record keeping. And sixth, you will learn how to develop a publication plan for promoting your research, increasing its visibility, and engaging with the scientific community.

When starting your research project, it is important to develop a strong file organization schema. Keeping your files organized and easily accessible is essential not only while conducting your research but also after the fact, should anyone have questions about your research, or for future work. To have good file organization, four things must be established.

One: consistent file and folder naming conventions. We will discuss the benefits of using descriptive and standardized naming conventions, including relevant metadata, to ensure easy searchability and avoid confusion. File and folder naming conventions also help when conducting data analysis and reading files into analytical software or code you may develop yourself. Here is an example of Python code reading all the CSV files in a specific folder, as identified by the folder path, into a computer. This is one way of importing raw data files into your code to conduct data analysis. By organizing your files and folder structures in advance and considering the need to conduct data analysis, you can design your file organization to reduce steps later in the research process.

Two: creating a logical and well-structured folder hierarchy is crucial for efficient file organization. We will explore different approaches to organizing folders, including hierarchical and functional structures.

Three: version control is essential for tracking changes, collaborating with others, and ensuring document integrity. We will delve into the benefits of implementing version control systems, such as Git or cloud-based solutions, to manage file revisions and facilitate collaborative work.

And four: choosing the right file storage solution is crucial for secure and accessible data management. We will explore different options, such as local storage, network drives, or cloud-based platforms, and discuss the considerations for selecting the most suitable file storage methods for your needs.

There are several key considerations and strategies for developing a naming convention that provides clarity and enhances your file management process. A fundamental element of your naming convention is the file type, which refers to what document was created or how the document was created. For example, the "what" might be the draft of a publication, whereas raw data files generated from thermogravimetric analysis, or TGA, describe how the document was created. Including this information in the file name helps you quickly identify the experiment, equipment, or other context related to the file.

Incorporating unique characteristics in your file name is valuable, especially

### [5:00](https://www.youtube.com/watch?v=QmIsWnTqKAE&t=300s) Segment 2 (05:00 - 10:00)

for differentiating between similar files. You can include the type of material or specific chemical compound related to the sample tested. This helps to identify each file's context and its association with your research variables.

File format, represented by the file extension, is crucial for understanding the purpose and compatibility of each file. Whether it's a DOC, PowerPoint, TIFF, JPEG, CSV, or text file, knowing the file format helps you determine where the file belongs within your file organization structure.

When conducting multiple replicates in each experiment, using sequential numbers in your file names can distinguish between similar files. Replication of an experiment with multiple specimens allows for meaningful statistical analysis. Noting the replicate number in the file name ensures easy identification of each replicate.

Including the experiment date in the file name provides another layer of identification. While the date shown in your file explorer can change to the last-modified date, the experiment date is a constant value. Including it in your file name helps you easily relate the file back to your research plan and stay organized.

By developing a thoughtful naming convention incorporating file types, unique characteristics, file formats, sequential numbers, and experiment dates, you can enhance your file organization and retrieval process. Remember, a robust naming convention ensures efficient management and easy identification of files throughout your research journey.

An effective file naming strategy is crucial for organizing and retrieving your files efficiently. Let's review some key guidelines to optimize your file names.

When naming your files, it is important to manage the length of the file name. Long names and unnecessary elements can lead to truncation when viewing files, making it easier to miss the file you need. Additionally, operating systems like Windows have a path length limitation, so keeping the file name concise is essential. Consider establishing codes, shorthand, or acronyms, such as using TGA instead of thermogravimetric_analysis, to save characters.

Special characters in file names can cause issues when loading files into data analysis code or other software. Some characters are not allowed, while others, like periods, can create confusion for the software. However, the underscore character is generally safe and serves as a useful separator in file names. It is essential to avoid special characters that may have specific meanings in coding languages and could generate errors. The underscore also serves as a convenient split point in the file name. If done correctly, you will have a lot of useful metadata in your file name that can be easily extracted during data analysis using languages like R or Python, provided you separate the elements with an underscore. That information can then be used to create tables and graphs of your analyzed data with the necessary context, thereby providing a finished product for publication in a few steps.

When including dates in your file names, it is best to adhere to the ISO format: four-digit year, two-digit month, and two-digit day. Avoid using month names and always include the year. Research often spans multiple years, and precise dating helps track file generation. Placing the year at the beginning of the date ensures files can be sorted chronologically, making them easier to view and locate.

When sequential numbering is necessary, it is recommended to use leading zeros and a three-digit numbering system. By starting with zeros, such as 001, 002, and so on, the files will remain in numerical order when sorted. Leading zeros ensure proper alignment and prevent confusion. This approach helps maintain file organization and facilitates accurate

### [10:00](https://www.youtube.com/watch?v=QmIsWnTqKAE&t=600s) Segment 3 (10:00 - 15:00)

file retrieval. By implementing these best practices, including managing length, avoiding special characters, utilizing date formats, and employing sequential numbering with leading zeros, you can streamline your file organization and enhance your ability to locate and work with files effectively.

In addition to effective file naming, organizing your folders with a well-designed hierarchy is vital for efficient file management. By establishing a thoughtful folder structure, you can easily navigate and locate your files based on relevant attributes.

The top-level folder serves as the foundation of your folder structure and should be organized by the most relevant attributes of your research. For example, you can use the year as the top-level folder to separate different research periods. You might specify the years explicitly, such as 2019, 2020, and so on, or simply refer to the folders as Year 1, Year 2, etc. As with the file names, keeping the folder names concise and utilizing codes or acronyms helps ensure the overall path length remains within the character limit.

Underneath the top-level folder, you should create subfolders to further categorize your files. For example, you may have a folder for meeting minutes and updates to store all relevant documents. Another folder can be dedicated to experiments, with separate folders for different types of experiments, such as tensile, TGA, SEM, and more. Lastly, a folder for papers or publication drafts can be included to keep all your research outputs organized.

The key principle behind the folder structure is to group files based on their relevance and relationships. By storing files in their corresponding folders, you create a logical system that reflects the different aspects of your research. This allows for seamless navigation and retrieval of files when needed.

When it comes to organizing folders, following best practices is just as important as with file names. To ensure a streamlined folder structure, it is crucial to avoid overlapping categories and folder redundancy. When selecting the top-level folder, consider the most repeated attribute in your files, such as the year. For instance, instead of having separate top-level folders for TGA experiments and SEM, create a single top-level folder for the year, such as Year 1 or 2019, and place all experiment folders and files from that year within it. This approach eliminates redundant categories and simplifies the overall structure. Also avoid using the year or another folder name as part of subsequent folder names, especially within the same hierarchy.

Creating a folder structure that strikes a balance is key. Avoid having folders with an excessive number of files, which can make it challenging to locate specific items. On the other hand, overly nested folders can lead to increased navigation complexity. To strike the right balance, start with commonly used attributes at the top level and progressively move towards more unique attributes. Take the time to plan and map out your folder structure in advance, ensuring it meets your specific needs and maintains an optimal balance between size and nesting depth.

Remember that the same guidelines that govern file names also apply to folder names. Avoid overly long folder names and names that include special characters. Underscores or camel case are good ways of creating separation between important folder name attributes.

Version control is a crucial aspect of file management, providing a historical reference for changes made, their timing, and the individuals responsible. Whether utilizing program-specific version control features or creating dedicated version documents, implementing version control safeguards your work, facilitates quick retrieval of specific file versions, and minimizes data loss and rework.

Manual version control allows you to track file versions using file naming conventions or change logs. By structuring your file names to include version numbers or utilizing a separate log to record changes, you can easily identify and differentiate between

### [15:00](https://www.youtube.com/watch?v=QmIsWnTqKAE&t=900s) Segment 4 (15:00 - 20:00)

different iterations of your files. Manual version control is a flexible approach that can be adapted to various file types and management systems, providing a clear history of modifications made.

Automatic version control is offered by many word processing programs, such as Google Docs or Microsoft Word. These tools can be set up to automatically save and track versions of your documents, allowing you to review and revert to previous versions. Additionally, cloud storage services like Google Drive or OneDrive often provide automatic version history, preserving older iterations of your files. This automated approach simplifies the tracking process and minimizes the risk of data loss.

For more complex version control needs, tools like Git and GitHub are available. Primarily designed for managing changes in computer code, these platforms allow you to track modifications, collaborate with others, and maintain a comprehensive history of your files. Although typically used for code management, they can also be utilized for version control of other file types, providing a centralized and efficient system for tracking changes.

Implementing version control brings several benefits to your file management process. It helps you locate the correct version of your file quickly, minimizing the risk of using outdated or incorrect information. Version control also acts as a safeguard against data loss, ensuring that previous iterations of your files are preserved. Additionally, version control reduces the need for extensive rework by enabling easy retrieval of specific versions and facilitating collaboration and documentation within your research project.

There are four types of file storage to consider when determining where and how to house your data.

Local storage, such as your laptop or desktop's hard drive, serves as the primary location for your research files. Most of your files will be generated from this local device. It is crucial to maintain a well-organized folder structure and consistent file naming conventions to ensure easy access and management.

External storage offers an additional layer of security and backup for your research files. Consider utilizing a large external hard drive or cloud storage service as your secondary storage location. Storing your files externally provides protection against data loss in case of hardware failures or unforeseen events. It is recommended to match your folder structure and file names across all storage locations. If you have a Chromebook or a laptop with little hard drive space, or use Google Docs or Microsoft 365 to do a lot of your work, I recommend using the cloud storage as your main storage location and the physical external hard drive as your backup.

When conducting experiments using air-gapped computers not connected to the internet, the initial storage location for your raw data files will be these computers. To transfer the files to your main storage location, you will need a suitable storage medium. USB storage keys are ideal for securely transferring data between air-gapped computers and your primary storage devices. Even if the computer connected to the experimental equipment is connected to the internet, it is still better to transfer your files from the computer to a USB storage key as a backup in case the upload to your cloud storage fails.

Universities often provide a virtual computing laboratory (VCL) for students to access sophisticated software and perform computationally intensive tasks. When using a VCL or other third-party, server-hosted software, your generated data will initially be saved on that server or virtual space. The lab may provide a shared drive that you can access from outside the virtual environment to transfer your files. Alternatively, you may need to utilize File Transfer Protocol, or FTP, to transfer data from the virtual environment to your main storage.

Regardless of where your files originate, ensure consistency in file naming and promptly transfer files to avoid data loss. Consistency and promptness are key when transferring files between different storage locations.
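To make the point about consistent naming during transfers concrete, here is a minimal Python sketch; it is not code from the lesson. It assumes a hypothetical naming convention, `<type>_<material>_<YYYYMMDD>_<replicate>.<ext>` (for example `TGA_PLA_20230415_001.csv`), and the names `parse_name` and `transfer` are invented for illustration. It copies only files whose names fit the convention and never overwrites a file that already exists at the destination.

```python
import re
import shutil
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical convention (an assumption, not taken from the lesson):
#   <type>_<material>_<YYYYMMDD>_<replicate>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<type>[A-Za-z0-9]+)_(?P<material>[A-Za-z0-9]+)_"
    r"(?P<date>\d{8})_(?P<replicate>\d{3})\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_name(filename):
    """Return the metadata encoded in a file name, or None if it does not fit."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    raw = fields["date"]
    try:
        fields["date"] = date(int(raw[:4]), int(raw[4:6]), int(raw[6:8]))
    except ValueError:  # digits that are not a real calendar date
        return None
    fields["replicate"] = int(fields["replicate"])
    return fields


def transfer(source, destination):
    """Copy conforming files from source to destination without overwriting.

    Returns two lists of file names: (copied, skipped).
    """
    source, destination = Path(source), Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied, skipped = [], []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue
        target = destination / path.name
        if parse_name(path.name) is None or target.exists():
            skipped.append(path.name)
            continue
        shutil.copy2(path, target)  # copy2 also tries to keep timestamps
        copied.append(path.name)
    return copied, skipped


# Demo in a throwaway directory so that no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    usb = Path(tmp) / "usb"
    usb.mkdir()
    (usb / "TGA_PLA_20230415_001.csv").write_text("time,mass\n")
    (usb / "final data (2).csv").write_text("time,mass\n")
    copied, skipped = transfer(usb, Path(tmp) / "main")
    print("copied:", copied)
    print("skipped:", skipped)
```

The skipped list is the useful part in practice: it flags files that should be renamed before the shared equipment computer or virtual lab server is wiped.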

### [20:00](https://www.youtube.com/watch?v=QmIsWnTqKAE&t=1200s) Segment 5 (20:00 - 25:00)

Whether it's transferring files from lab equipment computers, virtual computer lab servers, or shared drives, maintaining consistent file naming conventions is crucial. Additionally, promptly transfer files to avoid data loss, as equipment computers and virtual computer lab servers are often used by multiple individuals and may undergo regular data wiping processes.

You will generate a lot of files while conducting your research. Most, but not all, will require storage and organization. There are four main types of files you will generate that will require storage and organization, and they can serve as a guidepost in making that determination throughout your research project.

The first is raw data files, which are generated directly as a result of conducting experiments. These files contain vital information and serve as the foundation for your research. Examples include data files from instruments, equipment outputs, or any data collected during experiments.

The second is processed data files, which are the outcomes of analyzing the raw data files. These files can include processed images, numerical values, charts, graphs, or any derived data used to draw conclusions or support your research findings.

The third is research documents, which play a significant role in documenting and communicating your work. These files include draft conference proceedings, journal articles, conference presentations, dissertation chapters, meeting minutes, status updates or reports, and laboratory notes. Saving and organizing these documents within your file organization schema ensures easy access and comprehensive record keeping.

The fourth is software documents, which include documents generated by a design, analysis, or statistical software package. They are unique to that software and hold all the input data and instructions used to produce output files and data. They should be saved and documented both for historical purposes and in case you need to make revisions or updates. The category of software document files also includes any code files you generated to conduct data analysis or other tasks related to your research. These files can be written in Python, R, Perl, or other coding languages. In both cases, the software documents have their own unique file formats and corresponding extensions.

When deciding which files to include in your file organization schema, consider the nature and relevance of the file. Raw data files and processed data files are integral to your research and should be saved accordingly, as should software documents. Research-related documents, such as draft papers, presentations, meeting minutes, and laboratory notes, are essential for documenting and tracking your progress.

Start planning your file organization at the beginning of your research project. It is easier to stay organized from the start, but if you find yourself in the middle of a project without organization, don't worry. It's never too late to start. You can organize existing files or create a folder called pre-organization to hold them. Use your new file organization plan for all subsequent files.

Document all your file organization decisions. Create a README text file that explains your naming conventions, codes or acronyms, and folder structure. This documentation will be invaluable when transferring your research to another researcher or when you revisit your work years later and need to understand your organization. Include the locations of your main and backup file storage.

Consistency is key to successful file organization. Apply your file organization plan consistently to ensure it becomes a habit. Plan in a way that works intuitively for you to encourage adherence. Establish folder structures in advance, even if they remain empty for a while. Set up regular backups to your cloud server or external hard drive to maintain consistency.
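As an illustration of establishing the folder structure in advance and documenting it in a README, here is a short Python sketch under stated assumptions. The folder names loosely follow the lesson's example (a year at the top level, then meetings, experiments by type, and papers), but the exact names, the README wording, and the function `create_structure` are hypothetical choices, not the author's own setup.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: year at the top level, then functional subfolders.
FOLDERS = [
    "Year1/Meetings",
    "Year1/Experiments/Tensile",
    "Year1/Experiments/TGA",
    "Year1/Experiments/SEM",
    "Year1/Papers",
]

# Placeholder README content; the convention shown is an assumed example.
README_TEXT = """\
File organization notes
Naming convention: <type>_<material>_<YYYYMMDD>_<replicate>.<ext>
Acronyms: TGA = thermogravimetric analysis, SEM = scanning electron microscopy
Main storage: (fill in)
Backup storage: (fill in)
"""


def create_structure(root):
    """Create the (possibly empty) folders and a README; return the root path."""
    root = Path(root)
    for folder in FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)
    readme = root / "README.txt"
    if not readme.exists():  # never overwrite notes that were already written
        readme.write_text(README_TEXT, encoding="utf-8")
    return root


# Demo in a throwaway directory: build the tree and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    root = create_structure(Path(tmp) / "Research")
    for path in sorted(root.rglob("*")):
        print(path.relative_to(root))
```

Running the same function against the main and backup locations is one simple way to keep the folder structure matched across storage, as recommended earlier in the lesson.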

### [25:00](https://www.youtube.com/watch?v=QmIsWnTqKAE&t=1500s) Segment 6 (25:00 - 26:00)

Always carry a USB storage key with you when working in labs or other locations where you may need to transfer files quickly. Being prepared with a portable storage device ensures you can easily move files between computers or devices as needed.

Back up your files regularly and apply version control. Avoid overwriting files and instead create saved versions, such as backup_research_data_date or V1. Regular backups and version control protect your work from loss or accidental changes, providing a historical reference for your files.

By following these final tips, you can effectively organize your research files and maintain a structured workflow. Planning, documenting decisions, consistency, carrying a USB storage key, regular backups, and version control are all essential components of successful file organization. Implement these strategies and enjoy the benefits of an organized and efficient research process.

In the resources section of this lesson, I have included links to additional information on file storage and organization that can provide you with more guidance on this topic.
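The advice to create saved versions rather than overwrite files can be sketched in a few lines of Python. This is only one possible approach: the pattern `<stem>_<YYYYMMDD>_v<NN><suffix>` and the function name `save_version` are assumptions made for illustration, and the lesson itself does not prescribe a specific scheme.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def save_version(path, today=None):
    """Copy `path` to a dated, numbered sibling instead of overwriting it.

    The pattern <stem>_<YYYYMMDD>_v<NN><suffix> is an illustrative assumption.
    """
    path = Path(path)
    stamp = (today or date.today()).strftime("%Y%m%d")
    version = 1
    while True:
        name = f"{path.stem}_{stamp}_v{version:02d}{path.suffix}"
        candidate = path.with_name(name)
        if not candidate.exists():  # first unused version number wins
            break
        version += 1
    shutil.copy2(path, candidate)
    return candidate


# Demo in a throwaway directory: two saves give two distinct versions.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "research_data.csv"
    data.write_text("time,mass\n0,10.2\n")
    print(save_version(data).name)
    print(save_version(data).name)
```

Because the date comes first in ISO order and the version number has a leading zero, the saved copies sort chronologically in a file explorer.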

---
*Source: https://ekstraktznaniy.ru/video/42677*