Dr. Alvina Goh, Director, Data Science & Artificial Intelligence Division (DSAID), GovTech, spoke at the STACK-X Data Science Connect conference held yesterday (July 7) on how data science and artificial intelligence (AI) are used in Singapore.
According to Goh, the mission of data science and AI is to support government digital transformation to have better data-driven, timely, adaptive and targeted public policies; a more productive and efficient public sector workforce; and fast, anticipatory, personalised public service delivery.
Dr. Alvina Goh and moderator Watson Chua / Image credit: STACK-X Data Science Connect conference
She adds that this can be achieved through four key services and products: consultancy, incubation, software products and whole-of-government (WOG) platforms, and capability development.
Consultancy mainly refers to collaborating with agencies on high-impact projects that apply data science and AI to enhance policy decision-making, service delivery or operations. DSAID advises agencies on what can be done with the data they provide, through data visualisation dashboards and technical advisories.
Meanwhile, incubation of AI tech solutions for public sector domains is done in two ways: internally, and externally through partnerships. This results in proofs of concept (POCs) developed in-house and in partnership with other companies and research institutions.
In terms of software products and WOG platforms, which Goh notes have been on the rise, the aim is mainly to enable agencies to help themselves.
“[W]hat we really want to achieve here is to execute and deploy data science and AI treatment solution[s] at scale,” she explains. The output would be about making use of DSAID-driven products and platforms to support WOG efforts, such as the smart nation sensor platform.
Lastly, capability development is about supporting the transformation of public sector agencies by building their data science and AI capabilities. This is done through playbooks, guides and competency frameworks, as well as training, workshops and more.
She goes on to cite several DSAID use cases that apply data science and AI across different government agencies in Singapore:
1. Automated valuation for Housing & Development Board (HDB) resale portal
The use of prediction to reduce human processing was incorporated into the HDB resale portal in 2017.
Built in collaboration with GovTech, the HDB portal saves about S$2 million in valuation fees for about 90,000 resale flat buyers annually, according to Goh.
The main tool used is automated calibration. Machine learning algorithms streamline the valuation process by indicating whether the price of a transaction is reasonable.
Machine learning algorithm / Image credit: STACK-X Data Science Connect conference
The above diagram provided by Goh summarises how it has helped to improve the system, resulting in several benefits to buyers: cost savings of more than S$100, and a reduced timeline of about six working days where valuation is waived.
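As a rough illustration of what such a reasonableness check could look like, the Python sketch below compares a declared resale price against a model's predicted value. The function name, the 5 per cent tolerance and the sample prices are assumptions made for illustration; the article does not describe how the HDB model or its thresholds actually work.

```python
def is_price_reasonable(declared_price: float,
                        predicted_price: float,
                        tolerance: float = 0.05) -> bool:
    """Return True if the declared price is within `tolerance` of the prediction.

    `tolerance` is a fraction of the predicted price (hypothetical value).
    """
    if predicted_price <= 0:
        raise ValueError("predicted_price must be positive")
    deviation = abs(declared_price - predicted_price) / predicted_price
    return deviation <= tolerance


# Hypothetical transactions: (declared price, model-predicted price) in S$.
print(is_price_reasonable(510_000, 500_000))  # 2% deviation, within tolerance
print(is_price_reasonable(560_000, 500_000))  # 12% deviation, outside tolerance
```

In a flow like the one described, a transaction that passes the check could skip manual valuation, while one that fails would be referred for a closer look.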
2. National Environment Agency (NEA) rodent prediction
DSAID works with NEA by providing predictive models for pre-emptive action, in this case to predict the risk of rat burrows at each site. Each site is first assigned a risk score and the sites are then ranked, which allows NEA to prioritise higher-risk areas.
Previously, NEA would only notify the relevant stakeholders to address rat burrows upon discovering them. In contrast, this project allows NEA to be more pre-emptive so stakeholders can be notified ahead of time.
Goh shares that the data used to build this rat burrow prediction model is drawn from the following agencies:
- Construction site maps from the Building and Construction Authority (BCA) — piling work is known to displace rat populations from their burrows
- Food eatery locations from the Singapore Food Agency (SFA) — identifies potential food source for rats
- Terrain types from the Singapore Land Authority (SLA) — affects where burrows are sited
- Public feedback from the Municipal Services Office (MSO) — to identify rat burrow sites and food-related littering
Sensitivity analysis for rat burrow sites / Image credit: STACK-X Data Science Connect conference
With this data, the project team divided Singapore into “hexbins”, each with a 50-metre radius, which corresponds to the average travelling distance of a rat. They then use the data to predict the probability of rat burrows in each hexbin.
“We actually use three years’ worth of training data to predict for the next two months,” shares Goh.
Sensitivity analysis is also performed to understand the different drivers for each hexbin, allowing stakeholders to take pre-emptive and localised action.
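To make the idea of scoring, ranking and sensitivity analysis more concrete, here is a minimal Python sketch. It assumes a toy logistic model with made-up weights and three hypothetical hexbins; the real model, its features and its weights are not described in the talk. Sensitivity here is a simple one-at-a-time check: how much the predicted probability drops when a single feature is set to zero.

```python
import math

# Hypothetical feature weights for a toy logistic model (not NEA's or DSAID's).
WEIGHTS = {"construction_sites": 0.9, "eateries": 0.6, "litter_feedback": 0.4}
BIAS = -2.0


def burrow_probability(features: dict) -> float:
    """Toy logistic model: map feature counts in a hexbin to a probability."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_hexbins(hexbins: dict) -> list:
    """Return (hexbin_id, probability) pairs, highest risk first."""
    scored = [(hid, burrow_probability(f)) for hid, f in hexbins.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def driver_sensitivity(features: dict) -> dict:
    """Drop in predicted probability when each feature is set to zero in turn."""
    base = burrow_probability(features)
    return {
        name: base - burrow_probability({**features, name: 0.0})
        for name in WEIGHTS
    }


# Hypothetical hexbins with made-up feature counts.
hexbins = {
    "H001": {"construction_sites": 0, "eateries": 1, "litter_feedback": 0},
    "H002": {"construction_sites": 1, "eateries": 3, "litter_feedback": 0},
    "H003": {"construction_sites": 0, "eateries": 0, "litter_feedback": 2},
}

ranking = rank_hexbins(hexbins)
top_id = ranking[0][0]
drivers = driver_sensitivity(hexbins[top_id])
print(ranking)
print(top_id, "main driver:", max(drivers, key=drivers.get))
```

The ranking tells stakeholders where to look first, while the per-hexbin driver breakdown suggests what kind of localised action might help there.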
3. Recommendation systems for MyCareersFuture
The recommendation systems used in jobs portal MyCareersFuture aim to help job seekers find suitable jobs quickly by offering better choices from a vast range of options.
“[W]e have developed an engine where when job seekers go on MyCareersFuture’s website, and [they] search for jobs, the engine will actually return personalised suggestion[s], which is a unique set of jobs based on your profile,” details Goh.
The engine first gathers users’ implicit data, such as their clickstream behaviour and job applications, as well as explicit data such as their skillsets, in order to give a stronger signal of what would be a good match.
The algorithms, like collaborative filtering and content-based models, are then tailored based on use case and ambition to provide a seamless browsing experience.
Recommendation systems / Image credit: STACK-X Data Science Connect conference
Based on DSAID’s test runs, the engine is able to increase clickthrough rates significantly by 67 per cent, number of applications made to recommended jobs by 90 per cent, and overall applications by nine per cent.
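For readers unfamiliar with content-based models, the Python sketch below shows one very simple variant: ranking jobs by the overlap between a job seeker's skills and each job's required skills, measured with Jaccard similarity. The job titles, skill sets and function names are hypothetical, and this is not a description of the MyCareersFuture engine, which also draws on behavioural signals and collaborative filtering.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(user_skills: set, jobs: dict, top_n: int = 3) -> list:
    """Return up to `top_n` job titles with non-zero skill overlap, best first."""
    scored = [(jaccard(user_skills, skills), title) for title, skills in jobs.items()]
    matches = [(score, title) for score, title in scored if score > 0]
    # Sort by descending score, then alphabetically to break ties predictably.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in matches[:top_n]]


# Hypothetical job postings and their required skills.
jobs = {
    "Data Analyst": {"sql", "python", "statistics"},
    "Web Developer": {"javascript", "html", "css"},
    "ML Engineer": {"python", "machine learning", "statistics", "docker"},
}
user_skills = {"python", "statistics", "sql"}

recommendations = recommend(user_skills, jobs)
print(recommendations)
```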
4. SkillsFuture Fraud Analytics
Fraud detection was first implemented when SkillsFuture Singapore (SSG) encountered a series of scams.
DSAID was approached to set up AI algorithms that flag potentially fraudulent claims, using an unsupervised machine learning algorithm that can better detect fraudulent behaviour.
Goh uses the following illustration to explain how it works. The blue fish represent normal transactions, the red fish represent typical fraudulent behaviour, and the yellow fish corresponds to a new type of fraud.
Illustration of fraud analytics / Image credit: STACK-X Data Science Connect conference
Supervised learning is known to be effective at detecting typical fraudulent behaviour, but it might not pick up a new type of fraud. DSAID’s unsupervised learning algorithm is designed to detect both types by flagging the yellow and red fish as anomalous.
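The article does not say which unsupervised algorithm DSAID uses. As a stand-in, the Python sketch below applies one of the simplest label-free anomaly checks, a modified z-score based on the median and the median absolute deviation (MAD), to a list of made-up claim amounts. The threshold of 3.5 and the sample data are assumptions for illustration only.

```python
from statistics import median


def flag_anomalies(values: list, threshold: float = 3.5) -> list:
    """Return indices of values whose modified z-score exceeds `threshold`.

    Uses the median and the median absolute deviation (MAD), so no labelled
    examples of fraud are needed.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # No spread in the data, so nothing stands out.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical claim amounts in S$; one claim is far larger than the rest.
claims = [120, 135, 128, 140, 131, 125, 990, 133]
flagged = flag_anomalies(claims)
print(flagged)
```

Because the check only asks whether a claim looks unlike the bulk of the data, it can surface both familiar and previously unseen patterns, which is the property the fish illustration is meant to convey.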
5. GovText: Text-Analytics-as-a-Service
HDB initially received a lot of emails over the year, many of which were categorised as “others”. Using topic modelling, the DSAID team found that 30 per cent of these emails were actually about key collection, such as early key collection, deferred key collection, deferred key collection related to proof of marriage, and urgent key collection.
From the data provided by DSAID, HDB created an online system for key collection through GovText, a product that makes use of natural language processing and was one of the very early use cases DSAID worked on together with HDB.
The data collected also indicated that older applicants prefer to defer key collections, while younger applicants prefer key collections. GovText was thus built around this, where its system clusters words into topics. The user can then decide on a label for each topic based on their understanding of the words grouped under it.
Text analytics / Image credit: STACK-X Data Science Connect conference
Keywords of each topic can be assessed, together with the actual key rate associated with it. The team has also improved the GovText platform by adding other new text analytics services such as summarisation.
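Full topic modelling is beyond a short example, but the underlying intuition, that recurring words reveal what a pile of uncategorised emails is about, can be sketched with plain keyword counting. The Python snippet below is a much simpler stand-in, not GovText's method; the sample emails, the stopword list and the function name are all made up for illustration.

```python
import re
from collections import Counter

# Small hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {
    "a", "about", "after", "are", "can", "for", "i", "my",
    "needed", "new", "the", "until", "what",
}


def top_terms(emails: list, n: int = 2) -> list:
    """Return the `n` terms that appear in the most emails, with their counts."""
    counts = Counter()
    for text in emails:
        # Use a set so each term is counted at most once per email.
        tokens = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        counts.update(tokens)
    return counts.most_common(n)


# Hypothetical emails that were all filed under "others".
emails = [
    "Can I defer my key collection until after my wedding?",
    "Requesting early key collection for my new flat",
    "What documents are needed for urgent key collection?",
    "Question about parking season renewal",
]

terms = top_terms(emails)
print(terms)
```

A human reviewer would then look at the dominant terms and give the group a label, much as GovText users label the topics the system produces.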
“[P]ublic officers can use this text portal and choose the text analysis services that they need and [get it done] very quickly,” Goh highlights.
6. Speech-to-Text transcription
Speech-to-Text transcription (STT) is aimed at improving three areas: the individual productivity of the public service officer through note-taking, note-sharing and key info searches; the organisational workflow through call-logging and reports generation; and citizens’ customer experience through virtual assistants and voice controls.
GovTech wraps a full stack of STT solutions to provide a “consistent storefront” to users for seamless interaction with the system. The team also benchmarks models to determine the best base model for the different types of complexity it has encountered.
Web portal offering STT services for public officers / Image credit: STACK-X Data Science Connect conference
In addition, the GovTech team has also reframed the models with agency data to be more specific for that particular agency. The STT team has also developed in-house AI components to improve the transcription output.
GovTech is currently working with external partners to co-develop a minimum viable product for an AI Speech platform as part of the Government Technology Stack.
This will see agencies co-funding collaboration on their respective use cases, in which the speech transcription model is contextualised, word error rates are evaluated, and the reduction in transcription effort for agencies is measured.
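Word error rate (WER) is a standard transcription metric: the number of word substitutions, deletions and insertions needed to turn the system's output into the reference transcript, divided by the number of words in the reference. The Python sketch below computes it with a word-level edit distance; the example sentences are made up and the function name is our own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference transcript and STT output.
reference = "please collect your keys on monday"
hypothesis = "please collect the keys monday"
wer = word_error_rate(reference, hypothesis)
print(round(wer, 3))
```

Here one word is substituted and one is dropped, so two errors over six reference words give a WER of about 0.333. A lower WER after a model is adapted to an agency's data would indicate that the contextualisation is helping.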
7. Video analytics
Goh shares that her team built an object detector and tracking algorithm that enables agencies to easily identify the modes of transport people use, whether on foot, by bicycle or by personal mobility device (PMD), for better urban planning, since it provides agencies with the numbers for their target audience.
Object detector and tracking algorithm / Image credit: STACK-X Data Science Connect conference
Another use case for video analytics is in municipal services, where the team helps agencies automatically classify thousands of images into just a few categories, streamlining internal processes and scaling up operational throughput.
There is also people-counting video analytics on the edge, where crowds are counted at strategic ingress or egress points to enhance situational awareness and capacity planning.
Some of the features the team has worked on include remote access to camera features like live feeds and reports, trip-wire-based people-counting video analytics (VA) built into the camera, and integration with URA SpaceOut and NParks SafeDistParks.
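Trip-wire counting can be pictured as drawing a virtual line across the camera view and counting each tracked person whose path crosses it. The Python sketch below assumes the tracker already outputs, for each person, a sequence of vertical positions over time; the track data, the line position and the convention that increasing values mean "entering" are all hypothetical.

```python
def count_crossings(tracks: dict, line_y: float) -> tuple:
    """Count trip-wire crossings in each direction.

    `tracks` maps a track ID to that person's y-coordinates over successive
    frames. Moving from below `line_y` to at or above it counts as an entry;
    the reverse counts as an exit (an arbitrary convention for this sketch).
    """
    entered = 0
    exited = 0
    for positions in tracks.values():
        for prev, curr in zip(positions, positions[1:]):
            if prev < line_y <= curr:
                entered += 1
            elif prev >= line_y > curr:
                exited += 1
    return entered, exited


# Hypothetical tracker output: y-coordinate of each person per frame.
tracks = {
    "p1": [10, 40, 60, 80],  # crosses the line going in
    "p2": [90, 70, 45, 20],  # crosses the line going out
    "p3": [10, 20, 30],      # never reaches the line
}

entered, exited = count_crossings(tracks, line_y=50)
print(entered, exited)
```

Running this logic on the camera itself means only the counts, rather than the video, need to leave the device.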
Moving towards a smarter nation
As the use cases Goh shared at the STACK-X Data Science Connect conference show, GovTech has made concerted efforts to help Singapore become a smarter nation through collaboration among various government services.
Today’s government digital services are held to the highest standards by users. Not only must they be safe, secure and accurate, they also have to be easy to use and empowering.
As such, leveraging data science and AI is a timely development, considering the rapid move towards the digital age.
With digital transformation of the public sector at the heart of its operations, GovTech intends to continue harnessing the best info-communications technologies, data science and AI to make a difference to the everyday lives of people in Singapore.
Featured Image Credit: Cloud Expo Asia / DSAID GovTech via Medium
Source by vulcanpost.com