Posts

Showing posts from April, 2014

Australian Universities on Weibo

There are quite a number of Australian universities on Sina Weibo 新浪微博. Below I list those I have found. I include only those Australian universities that have verified Weibo accounts (the big blue V after the username). The text in square brackets is the username on Weibo.
Australian Catholic University [@ACUInternational] weibo.com/acuinternational
Charles Darwin University [@查尔斯达尔文大学] weibo.com/charlesdarwinuni
Curtin University [@科廷大学CurtinUniversity] weibo.com/CurtinWestAustralia
Deakin University [@澳大利亚迪肯大学] weibo.com/deakinuniversity
Federation University [@澳大利亚联邦大学FedUni] weibo.com/FedUniAustralia
Flinders University [@FlindersUni弗林德斯大学] weibo.com/flinders2011
La Trobe University [@澳大利亚拉筹伯大学] weibo.com/latrobeuniaus
Macquarie University [@澳大利亚麦考瑞大学] weibo.com/mquni
Monash University [@MonashUni澳大利亚蒙纳士大学] weibo.com/monashuniversityaust
Queensland University of Technology [@QUT昆士兰科技大学] weibo.com/qutbrisbane
Southern Cross University [@澳大利亚南十字星大学] weibo.com/scuchina
Swinburne Univ...

Regular Expressions

Regular Expressions are not just about ASCII. They are (or should be) about Unicode, with ASCII being a very small subset of Unicode. The vast majority of Regular Expressions documentation and tutorials I have seen deal only with ASCII. The consequence is that many, if not most, people never consider non-ASCII text strings. Once one considers Unicode text strings, one can process text consisting of non-Latin Scripts and Symbols. Scripts such as: Cyrillic, Devanagari, Tamil, Georgian, Cherokee, Chinese and Sinhala. Symbols such as: Currency, Arrows, Mathematical Operators, Mahjong Tiles and Playing Cards. Unicode has a repertoire of over 100,000 characters, all of which can be processed with Regular Expressions. Mostly, Regular Expressions are no different when using Unicode as compared to using the very limited ASCII. I will give some simple examples using Hangul, which is the Script used for writing Korean. The Hangul characters I will be using in the examples below are in Unicode block Hangu...
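As a rough illustration of the point that Regular Expressions work much the same on Unicode as on ASCII, here is a minimal Python sketch. It is not taken from the post: the choice of Python, the sample string, the helper name `find_hangul_runs`, and the use of the Hangul Syllables block (U+AC00 to U+D7A3) are all assumptions made for illustration, and the block used in the post's own examples may differ.

```python
import re

# Assumption: we target the Hangul Syllables block, U+AC00..U+D7A3.
# A character class over a code point range works exactly like [a-z] does for ASCII.
HANGUL_RUN = re.compile(r"[\uAC00-\uD7A3]+")


def find_hangul_runs(text):
    """Return every maximal run of Hangul syllables found in text (hypothetical helper)."""
    return HANGUL_RUN.findall(text)


# Hypothetical sample mixing Latin and Hangul text.
sample = "Seoul 서울 and Busan 부산"

hangul_runs = find_hangul_runs(sample)

# In Python 3, \w is Unicode-aware for str patterns, so it matches Hangul too.
unicode_words = re.findall(r"\w+", sample)

# The re.ASCII flag restricts \w to [a-zA-Z0-9_], which silently drops the Hangul words.
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)

print(hangul_runs)
print(unicode_words)
print(ascii_words)
```

The range-based class is the same construct one would write for ASCII letters; only the code points differ. The contrast between the last two calls shows what is lost when a pattern is limited to ASCII.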