cheng@portfolio:~$
$ whoami

CHENG, CHAO-HSIANG

$ cat role.txt

Data Science & NLP Researcher

$ cat interests.json
{
  "interests": [
    "Text Mining",
    "Machine Learning",
    "Language Acquisition"
  ],
  "affiliation": "NTUST",
  "status": "Building with data"
}

$ cat about.md

Researcher

Conducting quantitative research in foreign language acquisition using SEM, IRT, and multilevel modeling at NTUST.

Developer

Building interactive tools and research software with Python, R, Next.js, and machine learning frameworks.

Educator

Teaching assistant for ML, text mining, and language acquisition courses. Contributing editor to academic publications.

$ cat education.log

B.A. in Applied Foreign Languages

National Taiwan University of Science and Technology

Current

Sept 2024 – Sept 2026 (Expected)

Double Major: Electrical and Computer Engineering

Minor: Computer Science and Information Engineering

GPA: 4.28 / 4.3

Coursework: Text Mining, Information Security, Big Data Analysis, Data Science, NLP in Generative AI, ML for Text Mining, Programming Language, Data Structure

Exchange — Computer Science & Information Engineering

The Catholic University of Korea, Seoul

4.5/4.5

Sept 2023 – Dec 2024

Exchange — Business Informatics

University of Leipzig, Germany

1.0/1.0

Mar 2023 – Jun 2023

Associate Degree in English

Wenzao Ursuline University of Languages

First in Class

Sept 2019 – June 2024

Minor: German

GPA: 4.1 / 4.3

$ ls -la research/

Research Assistant

2024 – Present

Lab of Data Analytics in Human Science, NTUST

lab-website-kohl.vercel.app
  • Quantitative research in foreign language acquisition using SEM, IRT, and multilevel modeling
  • Applying machine learning and text mining to analyze language learning processes
  • Developing interactive educational tools and research software
  • Collaborating on meta-analyses of language learning motivation

Research Assistant

Feb 2025 – Sept 2025

Law & Technology Innovation Center, NTUST

www.ltic.ntust.edu.tw
  • Analyzed geothermal project technical parameters, cost structures, and regulatory mechanisms
  • Designed third-party audit systems and data withdrawal mechanisms for health data
  • Produced statistical summaries and analytical reports using Python

NSTC Undergraduate Research

Accepted

August 2025

Text Mining and Machine Learning in Religious Scriptures: A Data-Driven Analysis of Value Alignment Between the Bible, the Dhammapada, and the Tao Te Ching with Taiwan's Generation Z

NSTC Undergraduate Research

Accepted

August 2025

Digitization and Preservation of Indigenous Language Through Software Development: In the Case of Atayal

Contributing Editor

August 2025

Contributed to Text Analysis in Social Sciences: Applications of R by Prof. Wen-Ta Tseng, published by Wu-Nan Book Inc.


$ cat projects/*.json

ML Password Strength Assessment

Markov model-based password strength estimator, benchmarked against zxcvbn. Uses 4-gram analysis with Laplace smoothing.

AUC ~0.75, Precision ~60%, Recall ~75%
Python, Scikit-learn, Markov Chain
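
The 4-gram scoring idea described above can be illustrated with a minimal sketch. This is not the project's actual code: the padding symbols, the function names, and the five-password training list are hypothetical, and the alphabet is assumed to be whatever characters appear in the training data.

```python
import math
from collections import defaultdict

ORDER = 4  # 4-gram model: each character is predicted from the 3 before it
START, END = "\x02", "\x03"  # hypothetical padding symbols (an assumption)


def train(passwords):
    """Count 4-grams and their 3-character contexts in a password list."""
    ngram_counts = defaultdict(int)
    context_counts = defaultdict(int)
    alphabet = {END}
    for pw in passwords:
        padded = START * (ORDER - 1) + pw + END
        alphabet.update(pw)
        for i in range(ORDER - 1, len(padded)):
            context = padded[i - ORDER + 1:i]
            ngram_counts[context + padded[i]] += 1
            context_counts[context] += 1
    return ngram_counts, context_counts, len(alphabet)


def log2_prob(password, model):
    """Log2-probability of a password under the Laplace-smoothed model."""
    ngram_counts, context_counts, vocab_size = model
    padded = START * (ORDER - 1) + password + END
    total = 0.0
    for i in range(ORDER - 1, len(padded)):
        context = padded[i - ORDER + 1:i]
        # Laplace (add-one) smoothing: unseen 4-grams keep a small probability.
        numerator = ngram_counts.get(context + padded[i], 0) + 1
        denominator = context_counts.get(context, 0) + vocab_size
        total += math.log2(numerator / denominator)
    return total


def strength_bits(password, model):
    """Higher values mean the password is less likely under the model."""
    return -log2_prob(password, model)


# Toy training list for illustration only, not real project data.
model = train(["password", "password1", "123456", "qwerty", "letmein"])
common = strength_bits("password", model)
unusual = strength_bits("xK9#vQ2!", model)
```

With this toy list, `common` comes out lower than `unusual`, since every 4-gram of "password" was seen in training. A real system would presumably train on a large leaked-password corpus and calibrate the score against a baseline such as zxcvbn.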

Text Mining: Bilingual Education Policy

Large-scale text mining on 613 news articles (421,879 words) with sentiment analysis, TF-IDF, and co-occurrence network analysis.

613 articles, 421K words
R, Selenium, OpenAI API, NLP
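
As a rough illustration of the TF-IDF and co-occurrence steps named above, here is a small Python sketch. The project itself lists R, so this is a stand-in rather than the original pipeline. The three toy documents, the function names, and the choice of document-level co-occurrence are all assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def cooccurrence(docs):
    """Count how many documents each unordered term pair appears in together."""
    pairs = Counter()
    for doc in docs:
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


# Toy corpus for illustration only, not the 613-article dataset.
docs = [
    "bilingual policy english teachers".split(),
    "bilingual policy budget".split(),
    "english teachers shortage".split(),
]
weights = tf_idf(docs)
edges = cooccurrence(docs)
```

The pair counts in `edges` can serve as edge weights of a co-occurrence network, with terms as nodes. Terms that occur in fewer documents, such as "budget" here, receive higher TF-IDF weights than terms shared across documents.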

Multimodal Grammar Chatbot with RAG

Grammar correction chatbot for English articles that combines RAG with a fine-tuned model and supports file upload and screenshot analysis.

Loss: 4 → 0.18 over 60 epochs
Gemma3, LlamaFactory, PyTorch

Instagram Donation Platform

Industry collaboration with PSK Cosmetics. Interactive donation system for Taiwan's Whale & Dolphin Association with automated lottery.

Brand Engagement ↑
Next.js, React, MongoDB, Graph API

$ grep -r "teaching" experience/

TA — Applying Machine Learning to Text Mining

Sept 2025 – Present
  • Instructed students in ML algorithms for text mining applications
  • Taught Random Forest and Bayesian models for text classification
Scikit-learn, Random Forest, Naive Bayes
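
For readers unfamiliar with the Bayesian models mentioned above, the following is a minimal multinomial Naive Bayes text classifier in plain Python. It is an illustrative sketch, not course material: the course lists Scikit-learn, whereas this version uses only the standard library, and the function names and four training samples are invented.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """samples: list of (tokens, label). Returns the counts the model needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict_nb(tokens, model):
    """Return the label with the highest log posterior for the tokens."""
    label_counts, word_counts, vocab = model
    n_samples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / n_samples)  # log prior
        total = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothing keeps unseen (word, label) pairs non-zero.
            score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy samples for illustration only.
samples = [
    ("great lesson clear examples".split(), "pos"),
    ("clear and helpful".split(), "pos"),
    ("confusing slides boring".split(), "neg"),
    ("boring and unclear".split(), "neg"),
]
nb_model = train_nb(samples)
label_a = predict_nb("clear helpful examples".split(), nb_model)
label_b = predict_nb("boring slides".split(), nb_model)
```

In practice, Scikit-learn's `MultinomialNB` combined with `CountVectorizer` covers the same idea with far less code.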

TA — Language Acquisition

Mar 2025 – Jun 2025
  • Taught students to deploy and fine-tune local LLMs to simulate language acquisition theories
  • Demonstrated fine-tuning workflows using LlamaFactory
Gemma3, LlamaFactory, PyTorch

TA — Text Mining and Analysis

Sept 2024 – Dec 2024
  • Instructed students in text mining workflows and methodologies
  • Guided hands-on practice in preprocessing, tokenization, and analysis
R, Python, NLP

$ cat honors.log

Competitions

Final Round

NODASS Ocean Big Data Contest

National Academy of Marine Research, 2025

Optimizing Marine Conservation through Data Mining: MPA Effectiveness Assessment Using eDNA and ML

2nd Place

4th NTUST UN SDGs Presentation Competition

2025

Mindful Minds — Developing a Mental and Physical Health Education App

1st Place

AIoT Innovation System Training

National Taiwan University, Jul 2025

First place in the Final Project Competition on Artificial Intelligence of Things

Scholarships

2025

NSTC Undergraduate Research Scholarship

Religious Scriptures Text Mining Project

2025

NSTC Undergraduate Research Scholarship

Indigenous Language Digitization Project

2025

Academic Excellence Scholarship

NTUST

2024

Academic Excellence Award for Graduating Students

First in Class — Wenzao Ursuline University

2023

MOE Overseas Exchange Financial Assistance Grant

Ministry of Education

2022

Academic Excellence Scholarship

2020

Outstanding Conduct & Academic Performance

Fall 2020 & Spring 2020

2019

Academic Excellence Scholarship

Certifications

Python Programming — NTU

Jan 2025

C++ Programming 101 — NTU

Feb 2025

AIoT Practice with Arduino — NTU

Jul 2025

$ echo $CONTACT

contact.json
{
  "name": "CHENG, CHAO-HSIANG",
  "email": "a11317005@mail.ntust.edu.tw",
  "phone": "+886-970-733-372",
  "github": "qazasd2518995",
  "location": "Taipei, Taiwan",
  "available": true
}