Automation of Data Collection, Management and Analysis within the scope of Biodiversity Consulting

Thesis Date
Thesis Author
Miguel Ventura Gomes Paulo
Thesis Supervisor(s)
Pedro Segurado
Daniel Filipe Carvalho Miranda Pires
Thesis Summary

The automation of biological data collection and analysis has great potential to improve the efficiency and quality control of ecological studies. This work focuses on automating the data pipeline of a biodiversity consultancy company. The primary objective was to develop an integrated methodology that automates the entire process, from data collection to statistical analysis, thereby improving the quality and speed of analyses and supporting better-informed decisions in ecological studies. The methodology involves optimizing digital forms, implementing an API for automated data collection, and storing the data in PostgreSQL.

To contribute to this automation, a Python application was developed, which forms a significant part of this work. The tool allows users to produce outputs such as charts, tables, and even shapefiles in a matter of seconds, saving the considerable time otherwise spent searching for relevant data and then producing the necessary outputs.

Additionally, the internship included statistical analysis work for an ongoing project, in which scripts were created to process large datasets and extract meaningful insights and trends. Two supplementary Python applications were also developed to simplify current company processes, further reducing manual effort and minimizing the potential for human error.

In an initial phase, integrating these automated solutions made data collection, selection and output creation faster and more efficient. Because everything was stored in a single database, it was no longer necessary to search file by file for the data most relevant to a given output. This approach demonstrates the effectiveness of automation in handling large-scale biological data, simplifying processes and reducing the time spent on these tasks, and it also sets a precedent for future technological advancements in biodiversity consultancy.
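
As an illustration of why a single database removes the file-by-file search, the sketch below produces a per-species summary table for one site with a single query. It is only a minimal sketch under stated assumptions: the summary does not describe the actual schema or code, so the table `observations`, its columns, the function names and the demo rows are all hypothetical, and Python's built-in `sqlite3` module stands in for PostgreSQL so that the example runs without a database server.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up observation rows.

    Hypothetical schema: one row per field record (site, species, count).
    sqlite3 is used here only as a stand-in for PostgreSQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations (site TEXT, species TEXT, individuals INTEGER)"
    )
    conn.executemany(
        "INSERT INTO observations VALUES (?, ?, ?)",
        [
            ("Site A", "Turdus merula", 2),
            ("Site A", "Erithacus rubecula", 5),
            ("Site A", "Turdus merula", 1),
            ("Site B", "Turdus merula", 4),
        ],
    )
    conn.commit()
    return conn


def species_counts(conn, site):
    """Return (species, total individuals) rows for one site, most abundant first."""
    query = """
        SELECT species, SUM(individuals) AS total
        FROM observations
        WHERE site = ?
        GROUP BY species
        ORDER BY total DESC, species ASC
    """
    # The site is passed as a bound parameter rather than formatted into the SQL.
    return conn.execute(query, (site,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for species, total in species_counts(conn, "Site A"):
        print(f"{species}: {total}")
```

With the data in one place, selecting a different site or grouping by another column is a change to the query rather than a new search through separate files.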

Thesis Type
Internship