Data Scientist and Engineer

Hi, I'm Augustin Tsang

I build advanced computational models, work with big data, and develop scalable software solutions, with a budding passion for commodities markets. Welcome to my portfolio.

About Me

Hi, I'm Augustin, a senior at Berkeley. I'm pursuing a double major in Computer Science and Data Science while earning the Certificate in Technology & Entrepreneurship. I aspire to use AI/ML to understand global macroeconomic trends and commodity market fundamentals, and to build tools that facilitate the international exchange of goods, services, and information.

View Resume

My Skills

(1)

Data Science and Engineering

Data Science and Engineering involves extracting insights and knowledge from data using statistical, computational, and engineering techniques. It encompasses data collection, cleaning, analysis, and the application of machine learning models to solve complex problems.

(2)

Machine Learning and Deep Learning

Machine Learning and Deep Learning focus on building algorithms and neural networks that enable computers to learn from and make decisions based on data. This skill includes a range of techniques and frameworks to create models for various tasks.

(3)

LLMs and Natural Language Processing

LLMs and NLP involve training and using large language models and other techniques to understand, interpret, and generate human language. This skill includes tasks such as text classification, sentiment analysis, machine translation, and more.

(4)

Cloud Computing

Cloud Computing entails delivering computing services like storage, processing, and networking over the internet. This skill includes deploying, managing, and scaling applications on cloud platforms, leveraging technologies and best practices for efficient and secure cloud operations.

(5)

Computer Vision

Computer Vision involves training computers to interpret and make decisions based on visual data from the world. This skill includes tasks such as image classification, object detection, image segmentation, and facial recognition.

(6)

Database Management

Database Management involves the efficient storage, retrieval, and manipulation of large-scale data. This skill covers a range of technologies and best practices to ensure data integrity, security, and optimal performance when handling large datasets.

Project Highlights

Breast Cancer Severity Assessment with CNNs

A computer vision model to accurately assess breast cancer severity from biopsy images

Commodities and Supply Chain Deep Learning Model

A model to predict tech stock prices using commodity and supply chain data

Database Management System Implementation

An advanced database including indices, query optimization, and concurrent transactions

PintOS

An enhanced OS with advanced scheduling, synchronization, and process management

EEG Focus for YouTube

An EEG-based system that summarizes YouTube content during low-attention periods; winner of an NVIDIA hackathon

Predicting Cook County Housing Prices

A model to predict Cook County housing prices using a dataset of over 500,000 records

My Experience

Here are some of the roles where I've turned challenges into accomplishments.

Deepr

May 2024 - Present

ML Engineer

I developed and implemented an LLM-based song matching system that recognizes song titles, artist names, and music creator name variants from diverse data sources and API calls. I used natural language processing (NLP) techniques to improve data preprocessing, model training, and integration, producing a high-accuracy matching algorithm that significantly improved the user experience for a music recommendation platform.


EpiNu (Blum Center for Developing Economies)

January 2024 - May 2024

Computer Vision Engineer

I led the development of a prenatal micronutrient security algorithm for the Democratic Republic of the Congo (DRC) under Dr. Sonia Navani, using a custom YOLO model in PyTorch integrated with a comprehensive SQLite micronutrient database. This integration achieved a 55% reduction in computational load, making it effective in low-bandwidth scenarios. Additionally, I implemented a K-Nearest Neighbors algorithm to suggest alternative meals, optimizing nutritional intake for prenatal care.


Climformatics

August 2023 - December 2023

Data Scientist

I improved ice cover sensitivity predictions by 15% by training and testing ML models on albedo perturbation and cloud cover, and presented the results at the American Geophysical Union conference. I conducted time-series analysis on Arctic data to optimize albedo enhancement locations, reducing costs by 20%. I used xarray and Sabalcore supercomputing to compare regression and tree-based models, boosting prediction accuracy with Python and scikit-learn.


Contact Me

Have a question or want to work together? Send me a message using the form.

Email: augustintsang@gmail.com