About Me

I am a data scientist, research associate, and software developer. I love implementing interesting ideas, and I believe that only by implementing them can I learn how good they are.

My background is in Information Retrieval (Computer Science). During my Ph.D., I studied location information, in particular the trail patterns of social media users.

I love Python, and most of the code I wrote for my Ph.D. projects is in Python. Sometimes I need the power of Hadoop/Spark to scale my analysis up to several hundred cores. Scala is another favourite language of mine, especially when I want to use Spark. I also speak other languages, for example Java, SQL, C/C++, JavaScript, and HTML/CSS.

Here are some of my projects:

  • MintSearch: A test bed for graph search that integrates Neo4j and Terrier. //Scala
  • CDRC Data: A data portal website based on CKAN, serving data sets and metadata held by CDRC. //Python/JavaScript/HTML/CSS
  • Portraitist2: A dashboard tool, based on dc.js, for plotting different types of data interactively. Demo //JavaScript/HTML
  • BakMan: A file-based backup management helper that lets the user select a subset of files by rules. //Python
  • lazylist: A toy stream-like structure written in Python that tries to mimic Scala's Stream. //Python
  • lispy: A toy project that builds a Lisp interpreter in Python. //Python