A closer look at AI-powered governments

Who would have imagined, 30 years ago, that people – even the elderly or those with no knowledge of computers – would vote over this quirky new thing called the Internet? In Estonia, it has already been common practice for more than a decade. Thirty years from now, today’s government practices will probably seem and feel archaic, even prehistoric. What can we expect of the government of the day after tomorrow, bearing in mind Artificial Intelligence (AI) technologies and what’s already cooking?

Last September, the ICEDA network had the opportunity to learn about the Estonian digital state miracle directly from those who are at the forefront of the art and practice of digital government, experimenting audaciously in directions often still unimaginable to the rest of the world. These champions shared their experiences, strategies, and some tips. This tiny and super innovative Baltic country implemented something like a blockchain before Bitcoin was even invented: a distributed database technology that is one of the main ingredients of glorious e-Estonia. One thought that emerged from the discussion with the participants hung in the air: “People don’t like to interact with the government, it would be ideal to eliminate that contact!” From personal experience, people are anxious every time they have to interact with public administration employees to get an official stamp on a document. It’s not just the waiting in lines and the wasted mornings, but also the fact that none of it is necessary. In a perfect world, people would receive public services from the comfort of their homes with a single click, done in milliseconds, and they would prefer this option to the traditional one. What’s more, “digital native” generations have different expectations of any old-fashioned bureaucratic system. What would they expect – and demand – from a 2050 government and public administration?

Recent advances in digital technologies and their ample applications in our everyday lives are making the world a much different place. The transformation, at least the part where “software is eating the world”, is both deepening and accelerating. This article will focus on the use of AI in government and public services.

AI technologies 101

We are still far from the fantasized moment when machine intelligence surpasses human intelligence in all areas, but we often don’t realize that many different AI systems are already at our fingertips and all over our lives, and that even today we can’t imagine life without them.

When we “google”, the system is smart enough to read our intent and give us what we need instantly. When we use Google Maps, the search algorithm is smart enough to provide us with the optimal route. We don’t see or know the countless parameters analyzed in the background. That’s AI: a bunch of advanced algorithms that use a ton of data to solve a particular problem, from winning a chess game to helping us flawlessly navigate the web and the streets of the city we are in.

Take chess, where a machine, a computer program, can test the effectiveness of thousands of next moves while you consider which piece to move, and beat the best chess players in the world. IBM’s Deep Blue beat Garry Kasparov in 1997. Very narrow, but spectacular intelligence.

AI is in our phones. How else could our night photos be so beautiful? Algorithms make them nicer. How else could a phone be so much better than our eyes? New smartphones come with camera algorithms that make our photos better than ever before. They recognize faces, our location, or the position of the sun, so we don’t need to adjust settings manually. We can also speak with our phones and computers. iPhone users talk to Siri, while Android users use Google’s voice assistant. You can also have conversations with Amazon’s Alexa, a home assistant you can ask about the weather next Thursday. This is called speech recognition. We can also chat with AI systems. That technique is called natural language processing, and it lets computers understand natural language.

AI is in gaming, where it controls the agents we fight against. It’s in finance, where it can rate our creditworthiness in minutes. In China, which seems to be ahead of the West in this, mobile banks offer 1-second loan decisions. Not only has AI made the process lightning-fast, it has also given ordinary people and small business owners access to banking, which was much harder or even impossible before.

AI is a “digital beast”. When it encounters physical reality, it first digitizes it. “Computer vision is a field of AI that trains computers to interpret and understand the visual world.” Cameras on the street first record what’s happening, identify and classify objects in the video, and then, say, recognize people in it using facial recognition. The algorithm goes through existing databases of faces and recognizes identities by matching. This is potentially great for public safety but also very troublesome for privacy and control. The same tech is powering stores of the future, like in Amazon Go videos.

One of the next big things is autonomous, self-driving cars (cars that drive without humans at the steering wheel). While we are not fully there yet – city drives without any human intervention – many experts agree that it will happen in the next several years. That’s AI, and it will be the biggest change in transportation history. You could easily visualize robo taxis, something that Elon Musk and Tesla are already promising: you buy a car, rent it out as a robo taxi when you don’t drive it, and earn while you sleep.

AI is (already) everywhere. Some even call it the new electricity precisely because of that. Although artificial in nature, AI systems are arguably – when designed right, using so-called “human-centered design” – also very humane. Their purpose, among other things, is to free humans from unnecessary, repetitive, and boring tasks and to enhance human potential.

AI is also already in government in many forms, in the background, even in societies and governments that are not so digitally advanced, like the ones in the Balkans.

One example of narrow intelligence is the ongoing vaccination process in Serbia. You enter your name, citizen ID, and preferred city, and a simple algorithm sets the first available date, matches you with a vaccination location, and then automates the communication with medical institutions and medical staff. Similar efforts have been made in other countries in the region, such as North Macedonia, but they are much less automated.
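To make "a simple algorithm sets the first available date" concrete, here is a minimal sketch of what such a matcher could look like. It is not the actual Serbian system: the `Slot` record, the `book_first_available` function, and the sample clinics are all hypothetical, made up purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Slot:
    """A hypothetical vaccination slot: one day at one location."""
    city: str
    location: str
    day: date
    free_places: int


def book_first_available(slots: list[Slot], preferred_city: str) -> Optional[Slot]:
    """Return the earliest slot in the preferred city that still has a free place."""
    candidates = [s for s in slots if s.city == preferred_city and s.free_places > 0]
    if not candidates:
        return None  # nothing available in that city yet
    best = min(candidates, key=lambda s: s.day)
    best.free_places -= 1  # reserve one place for this citizen
    return best


# Illustrative data only.
slots = [
    Slot("Belgrade", "Clinic A", date(2021, 3, 12), 0),  # earliest, but full
    Slot("Belgrade", "Clinic B", date(2021, 3, 14), 5),
    Slot("Novi Sad", "Clinic C", date(2021, 3, 10), 3),
]
booking = book_first_available(slots, "Belgrade")
```

Here the citizen ends up at Clinic B, because the earlier Belgrade slot is already full. A real system would add identity checks, notifications, and priority groups on top of this core step.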

Smart use of algorithms in Western Balkan (WB) governments can be found in public e-procurement systems, professional and career advising, digital management of internal processes and procedures, tax fraud detection, etc. AI elements can also be found in the judiciary. In North Macedonia, the Automated Court Case Management Information System (ACCMIS) has been used since 2010 in all 34 courts, replacing the manual distribution of cases. Its database is located at the Supreme Court and can be accessed by members of the Judicial Council and the presidents of the Basic Courts. As there have been many instances in the past where its use was neglected, one of the main recommendations for 2021 in the European Commission’s 2020 Report, under Judiciary and fundamental rights (Chapter 23), is to improve this system to ensure its full functionality and reliability.

The “alGOVrithms – State of play” and “alGOVrithms 2.0 – The State of Play” publications list some examples of automated decision-making in Serbia, North Macedonia, and Kosovo, and present policy recommendations for decision-makers on how to make it better, more accountable, and fairer.

What is the government after all?

Like any other organization, the government is a group of people, the system that governs a state or a community. They do numerous little tasks every day, following the established and well-defined rules and procedures, with the intention to serve the public and citizens.

We could probably say that there are at least two parts to it. The first (decision-makers, public officials) is political: it makes decisions, strategies, and policies. The second (the civil servants) is more technical and professional, more “operative”: they identify, track, communicate requests and approvals, manage resources, do the calculations, give permissions, etc. If you look at it more closely, you will probably see that many of the tasks done within the government and public sector are repetitive, logical steps.

Did Filip, as we’ll call him, pay his taxes? If not, when should he be notified? How much should he pay? What message should be sent to him, and with what instructions? All this information exists in different systems and databases. If all of the data were digital and interconnected, a computer system boosted with powerful algorithms could automate large parts of processes like that. Registering a car, proving ownership, starting a company, or notarizing valuable documents – so many of those functions can be automated and outsourced to computers. With a little help from advanced algorithms, governments and citizens can save many hours and spend their valuable time doing more important things.
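The questions about Filip map almost one-to-one onto a few lines of rule-based code. The sketch below is only an illustration under simple assumptions: the `TaxRecord` fields, the `build_notice` function, and the wording of the messages are hypothetical and do not describe any real tax administration system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TaxRecord:
    """Hypothetical, already-digitized data about one taxpayer."""
    name: str
    amount_due: float
    amount_paid: float
    due_date: date


def build_notice(record: TaxRecord, today: date) -> Optional[str]:
    """Decide whether to notify the taxpayer, how much is owed, and what to say."""
    outstanding = record.amount_due - record.amount_paid
    if outstanding <= 0:
        return None  # taxes are paid, no message needed
    if today > record.due_date:
        days_late = (today - record.due_date).days
        return (f"{record.name}: {outstanding:.2f} is overdue by {days_late} days. "
                "Please pay as soon as possible.")
    return f"{record.name}: {outstanding:.2f} is due by {record.due_date.isoformat()}."


# Illustrative data only.
filip = TaxRecord("Filip", amount_due=300.0, amount_paid=100.0, due_date=date(2021, 3, 31))
notice = build_notice(filip, today=date(2021, 4, 10))
```

The hard part in practice is not this logic but getting the underlying data digital, interconnected, and trustworthy.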

What happens when we apply AI technologies in government and public sector operations?

Why add or divide large numbers by hand on a piece of paper when you have a calculator? That’s probably an adequate analogy for many potential cases.

The Western Balkan countries are not part of the EU, which has many different repercussions. One of them is how we cross borders at airports. Although we can move around freely thanks to “White Schengen”, we still wait in long lines at the airport. At some airports, EU citizens can just pass through by showing their passports to a machine and looking at a camera. How is this possible? There is a shared database of faces and algorithms powered by facial recognition tech. Computer systems recognize your face and are smart enough to read your travel documents, so you can just continue walking. “Voila!” Borders can be extremely frustrating, so imagine how much unnecessary human “suffering” solutions like this one can eliminate.

What about some other examples of AI use in the government and public sector? Let’s get back to Estonia, our honorable mention from the beginning of the article, where the government’s goal is to enable a ‘zero bureaucracy’ experience. How does Estonia use AI in delivering public services?

One of the really cool things they will roll out soon is the #KrattAI voice assistants. Soon, Estonian citizens will be able to register companies by talking to their phones, which will understand human language and be equipped to flawlessly navigate the whole registration process.

AI can help unemployed people as well. In one AI application, a machine learning system matches recently unemployed people with employers based on their skills. Resumes are fed into a system that matches the skills in them with potential employers. The computer-matching system brought an improvement: 72 percent of workers who got a job through the system were still on the job after six months, compared to 58 percent before the system was deployed.
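The Estonian system is described as machine learning, and its internals are not covered here. As a much simpler stand-in, the sketch below ranks jobs by how many of their required skills a candidate already has. The function names, job titles, and skill sets are all hypothetical.

```python
from __future__ import annotations


def skill_overlap(candidate: set[str], required: set[str]) -> float:
    """Share of a job's required skills that the candidate has (0.0 to 1.0)."""
    if not required:
        return 0.0
    return len(candidate & required) / len(required)


def rank_jobs(candidate: set[str], jobs: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Return (job title, score) pairs, best match first."""
    scored = [(title, skill_overlap(candidate, required)) for title, required in jobs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative data only.
jobs = {
    "Warehouse operator": {"forklift", "inventory", "safety"},
    "Office assistant": {"typing", "scheduling", "bookkeeping"},
}
ranking = rank_jobs({"forklift", "safety", "driving"}, jobs)
```

A learned model would go further than this, for example by weighing skills according to which past matches actually lasted, which is the kind of outcome the six-month retention figure measures.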

There is one very interesting example that uses satellite imagery and a deep learning technique. Farmers receive government subsidies to cut their hayfields, and European Space Agency satellites are used to check that they do so, instead of physical inspections. Deep learning algorithms analyze the image pixels to determine what’s happening on the ground, without the need for regular inspection visits. This system saved €665,000 of public money in its first year. More about how the system works can be learned in this presentation.

Let’s go to the other part of the world, to China, Asia.

One of the interesting concepts to note is augmented intelligence. To put it briefly, it’s a human-machine collaboration, where machine intelligence fuels and enhances human decisions.

Shenzhen, China, is one of the tech capitals of the world. The city partnered with many private companies to install sensors and smart cameras everywhere. Transportation is never late and everything is almost perfectly optimized using computer algorithms.

“Smart city brain” of LOC in Shenzhen

Data is streamed to and analyzed by a big artificial “City Brain”. It’s then all visualized in a human-friendly way, so government employees can use this superpower to oversee all the relevant dynamics at a glance. There’s literally no way for a human to do all that math. Machines do it in milliseconds. Optimization and performance are at the maximum. A lot of great studies have been written on AI in China, many of which raise concerns about surveillance and control.

Some Chinese face recognition technology is also used in Belgrade, Serbia. Whilst promising for public safety, if implemented without appropriate rules around it that protect privacy, it can be dangerous. One study analyzes this system and its potential dangers in Serbia. In other Western Balkans countries, these systems aren’t in place so far.

It’s a long road to any of the systems briefly presented above, packed with brave ideas, tons of testing and experimentation, security audits, impact assessments, research on potential biases, policy initiatives and changes, and improvements of regulatory and legal environments.

All AI systems require tons of quality, digitized, machine-readable data. Setting this up takes years. High-quality public-private partnerships are necessary because the most advanced skills and state-of-the-art technologies are built by private companies. All the examples of AI use in the public sector above are designed and implemented in some sort of public-private partnership. Of course, nothing important and large-scale is possible without the knowledge, skills, and openness of decision-makers. Not to mention legal requirements, security concerns, and accountability issues. It’s a long, strategic, decisive, and comprehensive course of action.

A wild and automated future is definitely ahead

“Innovation” and “digitalization” are keywords in most governments in the world, and the so-called “GovTech” market is booming. The $400 billion #GovTech market (Gartner) is full of startups and other companies building technologies for the government, improving many aspects of it. Some estimates say it will hit a trillion dollars by 2025. When a market is that big, it means there’s a whole ecosystem around it, made up of entrepreneurs, investors, academia, experts, and even specialized media. Many actors are pushing for change. This means that innovation doesn’t just come from within the government, but is also pitched almost every day by tech innovators from all over the world.

Two relevant trends to think about, instead of a conclusion. It’s important to understand where the world is headed because it will change anyway. These trends also represent great opportunities for us to improve how we do things, including how governments serve the needs of their citizens.

Technology is developing at an extremely high pace. There’s so much money in the technology industry because everyone understands its market potential and its impact. All the biggest companies in the world right now are technology companies, and that probably won’t ever change.

Consumer habits and consumer demand also change. If AI is possible in our day-to-day lives, why isn’t it in the government? Citizens are consumers of government services, and they can and should demand better. If you can register a new company in Estonia online in minutes, or even vote online, why isn’t that happening elsewhere as well?

With just those two trends in mind, we can expect a lot of fun and radical improvements in the way governments work in the coming years and decades. We have only just started. And while many of the examples briefly touched on above might seem futuristic for countries in the Western Balkans, a lot of local innovators are already thinking about things like that. The IT sector, being one of the strongest and most vibrant parts of the Western Balkans economy, can play a huge role in this forthcoming transformation. A lot of international organizations, the European Union among them, are also funding, nurturing, and guiding government transformation and innovation in the region, and the ICEDA network is just a little piece in the big puzzle of a better, digitized future where things are smarter, cheaper, easier, and more life-friendly.

Daily Random Thoughts #8 – Relationships #1: “Ethical Relationships” (13/11/20)

I have been thinking a lot recently about something I call “ethical relationships” (the concept probably already exists?).

What are “ethical relationships”? Let’s try to play with the concept.

1) Benefit of the Other

They are positive-sum games where both sides sincerely care about each other. Both, whether lovers or friends, proactively do things that they think will benefit the Other. If one side feels better, the other feels laughter. Wins are shared.

2) Growth

Both sides grow as a consequence of the relationship. They become better.

Those relationships are brave, on an intimate level, in the sense that they step outside usual boundaries, which makes both parties stronger.

3) Transparency

I am fascinated with transparency as a principle. In the case of human relationships, it would mean that honesty has the highest value and that intentions are communicated clearly. Feelings are not hidden but shared. It makes everything (more) flawless. And faster. 

4) Good for the World

Besides being beneficial for both actors, they are also great for the World. People who appreciate and trust each other are good for their human environment. Energy spreads.

Daily Random Thoughts #7: Why Everyone Should Start an NGO as Early as Possible (11/11/20)

You will learn a ton and earn things money can’t buy.

What is an NGO? An organization working for a specific cause. Fighting for the cause is always a good thing. Learning about organizing (and people in general) is one of the most rewarding experiences because organizations are everywhere, and everything is an organization.  

This is an (unfinished!) list of the things you will learn: legal and taxes; fundraising; management; project management; human resources; strategic communications; marketing; accounting; public relations; public speaking; networking; leadership; sales; community organizing.

Sounds like a mini MBA? 

How to do it? List things you care about. May it be pets, astronomy (was my case in high school!), sport, startups, elders… Refugees, digital government, peace, or clean water. So many things that aren’t right and can be improved! Find those who care about the same thing(s). You’re not alone. You’re never alone. And just start. Learn along the way.

You’ll meet new friends, the most meaningful kind: those who share your values. From a strict investment perspective, you will benefit greatly. Up to a few times bigger income, if you decide to shift to business years later. Skills and relationships are probably the two most important things for any professional.

It’s important to found/co-found because then you need to do everything.  Also, the feeling when you realize that you have actually built something that matters from scratch. Phenomenal!

It’s a social capital game; personal gains are truly exponential, but it’s also great for society. People, together, becoming relevant and influencing society in positive ways.

I think I will write an in-depth essay on this!

Daily Random Thoughts #6: Changing – And Even Inventing – The Past (13/10/20)

I was so fascinated with the possibility of changing the Past that I wanted to do an MA on the subject once, 5 years ago. My angle would be propaganda focused on changing the past, and how operations like that influence behavior and the future.

The Past should be – done? Fixed? Wrong. On both the “historical” and the personal level, the Past can change.

On a personal level.

I read “The Schopenhauer Cure” by Irvin D. Yalom in a day. It was my personal reading record back then, 400 pages in a day. At some point in the story, a very powerful thought of Nietzsche’s is introduced.

“To change ‘it was’ into ‘thus I willed it’—that alone shall I call redemption.”

You are probably familiar with the genius music video for Massive Attack’s “Angel”?

It’s something like a revolution when things change like that. Past changes, as it’s only and always an interpretation, so the future becomes different. That past-future relationship is very interesting!

On a historical level.

I remember one particular lecture on the history of Serbian political thought while attending political science classes in Belgrade. The professor was naming early 19th-century Serbian philanthropists and explaining their contributions, a lot of them. After a while, he curiously asked us: “Do you feel better now? Did you know that these men were your fellows from the past?” It was intriguing. We did not know any of them. And we did feel better.

How we understand ourselves, collectively, influences not just how we feel, but also how we will act in the future. That’s, for example, one of the functions of myths, right?

On a philosophical level.

Past is always “in relation to us”? Hence it’s always facts plus interpretation, and that interpretation is what matters the most. It opens so many questions about truth, but that’s a whole other ground.

Daily Random Thoughts #5: Abraham Lincoln And The Rabbit Hole (12/10/20)

Did you know that you can be a lawyer without even going to law school? Abraham was one of them. And did you know that you can get a PhD even without a BA? And not just by founding a future trillion-dollar company, like Zuck or Bill.

That night I was interested in Abraham Lincoln, one of the most cherished US presidents. I love getting to know interesting, influential historical people, deeply. Abraham was such a masterful politician. He hacked his way to the Presidency and did so many splendid things. I watched this documentary as a first introduction and I will definitely dig more into him in the future. It was an inspiring night; I fell into a deep rabbit hole.

I did not know that Republicans were the ones against slavery and that Democrats were largely OK with it. A short video on the white supremacy legacy of the Democratic party. A short video on how the Republican party went from Lincoln to Trump. It’s interesting how things radically change over time. Lincoln was huge.

Then I was interested in Abraham’s (formal) education. You should read this incredible article about how to become a lawyer without going to law school. Lincoln was self-taught. So were many others. When you try to catch the bigger picture, you find so many great self-taught individuals, and realize that it’s actually a very strong tradition, especially in the US.

In that regard, it’s interesting to think about what Peter Thiel is doing with his Thiel Fellowship. For those not familiar, it’s $100,000 to drop out, even from high school. It does make sense: “Build new things instead of sitting in a classroom.” It’s at least 10x learning. Then again, Peter Thiel, with whom I would not agree on so many points, claims that higher education is an insurance policy, and a bad one at that. Google it, there are a few very interesting hypotheses.

Then I ended up reading about PhDs without an MA – possible in the US – and found out that you can even get a Doctor of Philosophy degree without finishing undergraduate studies. The great ones are always an exception, Ludwig Wittgenstein comes to mind first, but it’s possible even for “mortals”. For example, you can start digging here. It’s kind of obvious, though: a PhD should just be your authentic scientific contribution to the world. Why would anyone care whether you paid a lot of cash and spent 3 years preparing for it? Have a great scientific contribution? And 3 professors and an institution to assess it, a procedure? Great: thanks for your contribution, you’re a PhD.

Then I stumbled upon the example of the former MIT Media Lab Director, Joi Ito, who later resigned because of his ties to Jeffrey Epstein. He led one of the most interesting academic institutions in the world without any significant formal academic credentials. Interesting bio.

I love rabbit holes. You can get inspired and learn so much.

Daily Random Thoughts #4: What If You Could Clone And Multiply Yourself? (11/10/2020)

So, we’re in Croatia, on the dance floor. Hunee is playing trancy ambient and obscure disco music. It’s lovely. I have this nasty habit of thinking while dancing. A close friend even said to me once: “People go out to get drunk and make a mess; you go out to dance and think!”

The music was still catching heights when I got lost in thinking about how wonderful it would be to clone yourself. Selling your time for cash, that’s frustratingly limited. What if there was a way to multiply it, several, even unlimited times, all in parallel?

Think about it. If all your work generates a digital output (design / code / writing / even voice), it wouldn’t matter if it was done by a machine, computer, instead of you. Let’s just say it’s on your behalf. If what’s truly you, skill-wise, and what you can do, could be learned and easily reproduced in novel contexts, that would be such a game-changer.

OK, something, an advanced AI system, would need to scan a) all of your past works and b) your current processes to be able to produce output as if it were yours. For the first one, if your past output is already digital, there’s enough data to train the algorithm. Your style can be mastered. For the second one, would it be possible without brain-computer interfaces? Can your unique skills, as in the unique process behind them, be learned by a machine?

Then I got lost fantasizing how it would be so rad if it was possible. If possible:

It would be such empowerment for all those selling their digital time: Instead of, say, a few dollars an hour, it could be much more. More equality, and opportunities, as a consequence.

From the perspective of the whole economy: What a productivity boost that would be! All the great and necessary things could infinitely accelerate.

Elon Musk is a big-league psycho, in a good, even magnificent way. Neuralink – this is obviously an afterthought, because Neuralink wasn’t born when that dance happened – might be part of the solution here. Or something similar. General AI technologies like GPT-3 or some future GPT-9 might be part of the alchemic equation.

I concluded that it will be possible in the future in one way or another and that it will be mega exciting, and continued to dance.

Daily Random Thoughts #3: Superskills and Sales In Particular (10/10/2020)

My dear friend Lav surprised me with “The Last Safe Investment”, a book on contemporary life, a few years ago. One of the most interesting things in it is the exploration of so-called superskills. Those are skills that are universally applicable and can bring exponential returns. One of them is particularly interesting: sales.

It’s one of the most rewarding ones. Everyone should learn it. It’s a colossal investment.

First of all, it’s quite universal. The world will always need salespeople because sales is what makes the economy work. Any industry needs it. Any business needs it. The market cries out for skilled salespeople. You can practice on your street, selling candy. I am not kidding 🙂 Great salespeople can become really wealthy. If that’s your goal.

You don’t have to be a salesperson, it doesn’t have to be your career choice. It can be stressful, and it doesn’t fit the character of many. Sales skill is probably the greatest multiplier of all the skills. No matter what you are good at and what your passion is, add a sales superskill to it and everything will be multiplied. Academic? Publish your Ph.D. work in top media publications and that’s a few thousand $$ more. Programmer? How about earning 3 times more? It’s sales. Student? Get that prestige scholarship and paid internship. Entrepreneurship is mostly about sales. Sales get you better hires and media hype. 

What is sales? An action that leads to the desired transaction. And, if you think about it more deeply, everything is a transaction. A date, your salary, any pitch, no matter how naive or big.

For more on superskills you should read the book. I would definitely recommend it.

Daily Random Thoughts #2: Luck is More Accessible Than Ever (09/10/20)

Some people are in the right place at the right time, and just that, sometimes even literally, makes them millionaires. Some are lucky in the sense that they never experience poverty or anything close to it, or because they’re born athletic, with an IQ of 180, or because they were loved by both parents, or because they had parents around them at all. Just a random thing like country of birth matters a hell of a lot.

I don’t believe that what we are given as a starting point can’t be totally, and I mean totally, changed. Even luck-wise. Nietzsche was my favorite philosopher for a while; I did graduate work on his philosophy in high school. I deeply believe in so-called “voluntarism”, Will being the mightiest force in the universe. If we can decide to be luckier, how do we do it?

I believe that the keyword is “network”. And exactly that’s why I think we live in an age when luck is more accessible than ever before. 

What is luck after all? A random, largely unexpected major win? If you’re exposed more, it can hit you more often. It’s a probability thing. If it’s a probability thing, then the Internet and all of its products are the best thing ever. And if it’s mostly about networks, then our conscious, decisive plays can make a significant change. It can become some sort of a personal revolution.
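To put a number on "if you’re exposed more, it can hit you more often", here is a tiny sketch. It assumes, purely for illustration, that every exposure (a conversation, a post, a new group) has the same small, independent chance `p` of turning into a lucky break. Real life is messier, and both the function name and the 1% figure are made up.

```python
def chance_of_at_least_one_hit(p: float, exposures: int) -> float:
    """Probability of at least one lucky break in `exposures` independent tries."""
    return 1 - (1 - p) ** exposures


# Hypothetical 1% chance per exposure.
few = chance_of_at_least_one_hit(0.01, 10)    # roughly 0.10
many = chance_of_at_least_one_hit(0.01, 100)  # roughly 0.63
```

Under these assumptions, going from 10 to 100 exposures lifts the odds of at least one hit from about one in ten to almost two in three.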

The Universe, World, God, Life, or whatever you believe in, can be atrocious to you. Still, I do believe you can proactively and directly influence the level of your Luck. You first need to be (self)conscious about it.

All the opportunities (luck being one of them) float through networks, as information. That fabulous, life-changing opportunity, someone serving you a 10x or 100x wizardry in a second, it’s there, somewhere within the network. Those networks are primarily made of people. People talk to each other, people share, people love and enjoy giving. Many get lucky along the way, without a conscious intention. So many Lucks occur as consequences of random interactions.

Let’s focus on the “right person in the right place at the right time” for a bit. OK, it’s about location, and that’s highly relative, in the sense that borders often act as evil cages. Then migrate. As more of the places where things actually happen move online (FB groups, Discord or Slack channels, forums, email lists, and so on), it’s easier than ever before to be where the magic happens. Luck strikes in magical places. You can find them.

Why are some people luckier? The first reason is that they are there. The second, obvious one, is that they have access to people other mortals don’t. Think of alumni networks, rich parents, friends of parents, homies from the hood, kindergarten bros and sis. Hence the keyword is “network”. You can’t compete with that. You probably won’t be the luckiest person in the world. But you can, I am 100% sure of it, decide to be luckier, and act upon it.

What, for example, can one do to become luckier?

Learn to network. Do it all the time.

I believe this is the most important. It’s all about relationships. That is what makes some privileged, right? The truth is: anyone can do it. Invest in people. Find your tribe. The world is composed of tribes. Find yours. And then, on some random day, someone will casually mention something that will change your life forever and you’ll be lucky.

Even highly introverted people are not that shy online. You don’t have to speak in front of 500 people you don’t know. You can use hashtags on Twitter. Or Instagram. Or LinkedIn. Whatever one’s passions and interests are, there’s a very high probability that there are hundreds of others around the planet who share them. That fit is natural and it’s really powerful. Welcome to your network!

Be more open.

Some people see things others don’t and act when they smell Luck. They’re luckier because they’re more open. Be aware of your environment and the world around you. Listen. Watch. Many people don’t see at all and it’s such a shame. Say YES and “why not?” more often. Anyone can become more open. It’s super easy. Drives Luck as well.


I love the content on the Internet. It’s the most egalitarian thing in the world: it doesn’t matter if you have a Ferrari in your garage or Prada shoes on your feet. It’s cheap to produce. The magnificent ways content can lead to Luck are nuts. One morning you can receive a life-changing message because someone has stumbled upon you accidentally and, oh, you’re lucky!

Learn to be more proactive.

I guess it takes a while, but it’s achievable. It’s a prerequisite for so many great things in life. The gym is one of the little hacks. There are so many of them.

Seriously, what are you doing today to be luckier tomorrow? A world with luckier people would be a much happier place to live in. Everyone should try to learn to be luckier.

Daily Random Thoughts #1: What If It’s Just Nonsense? (08/10/2020)

“Conspiracy theories” are not always just paranoia and overdramatizing. People do try to conspire when they can. The history of secret diplomacy would be one instance, where countries agree on borders and other important stuff behind the back of the rest of the world. Or when business leaders illegally draft secret masterplans over coffee in a random hotel bar. It’s natural, and rational, sometimes even legitimate.

What’s much more interesting though, on the opposite side of the spectrum, is where the reason for X is pure stupidity. Quite often, I assume, unexpected personal reasons demolish countries, too. Sometimes it’s just nonsense.

One of my favorite movies is “I Stand Alone”, written and directed by Gaspar Noé. In short: a series of misunderstandings ruins a guy’s life. It kept me wondering a lot.

A few days ago I heard a very interesting story about a surreal, almost existential-risk episode. Some civilian rebels in Pakistan, I was told, captured a building without knowing that a nuke was inside. Shit happens for no particular reason.

It might not be anybody’s intention, a plan, but rather “just nonsense”.

Wikidata and the Next Generation of the Web – Goran Milovanović, Data Scientist for Wikidata @Wikimedia Deutschland

The world without Wikipedia would be a much sadder place. Everyone knows about it and everyone uses it – daily. The dreamlike promise of a great internet encyclopedia, accessible to anyone, anywhere, all the time – for free, has become a reality. Wikidata, a lesser-known part of the Wikimedia family, is the data backend of all Wikimedia projects and fuels Apple’s Siri, Google Assistant, and Amazon’s Alexa, among many other popular and widely used applications and systems.

Wikipedia is one of the most popular websites in the world. It represents everything glorious about the open web, where people share knowledge freely, generating exponential benefits for humanity. Its economic impact can’t be calculated; being used by hundreds of millions, if not billions of people worldwide, it fuels everything from the work of academics to business development.

Wikipedia is far more than just a free encyclopedia we all love. It’s part of the Wikimedia family, which is, in their own words: “a global movement whose mission is to bring free educational content to the world.” To summarize its vision: “Imagine a world in which every single human being can freely share in the sum of all knowledge.”

Not that many people know enough about Wikidata, which acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.

Goran Milovanović is one of the most knowledgeable people I know. I invited him to lecture about the potential of Data Science in public administration at a policy conference that I organized five years ago. We remained friends and I enjoy talking with him about everything that has ever popped into the back of my head. Interested in early Internet development and the role of RAND in immediate postwar America? No worries, he’ll speak about it for 15 minutes in one breath.

Goran earned a Ph.D. in Psychology (a 500+ page one, on the topic of Rationality in Cognitive Psychology) from the University of Belgrade in 2013, following two years as a graduate student in Cognition and Perception at NYU, United States. He spent many years doing research in Online Behaviour, Information Society Development, and Internet Governance, co-authoring 5 books about the internet in Serbia. He has provided consultancy services in Data Science for Wikidata to Wikimedia Deutschland since 2017, where he is responsible for the full-stack development and maintenance of several analytical systems and reports on this complex knowledge base, and he runs his own Data Science boutique consultancy, DataKolektiv, from Belgrade.

He’s a perfect mixture of humanities, math, and engineering, constantly contemplating the world from a unique perspective that takes so many different angles into account. 

We focused our chat on Wikidata and the future of the web, but, as always, touched on many different phenomena and trends.

Before we jump to Wikidata: What is Computational Cognitive Psychology and why were you so fascinated with it?

Computational Cognitive Psychology is a theoretical approach to the study of the mind. We assume that human cognitive processes – processes involved in the generation of knowledge: perception, reasoning, judgment, decision making, language, etc. – can essentially be viewed as computational processes, algorithms running not in silico but on the biologically evolved, physiological hardware of our brains. My journey into this field began when I entered the Department of Psychology in Belgrade, in 1993, following more than ten years of computer programming since the 80s and a short stay at the Faculty of Mathematics. In the beginning, I was fascinated by the power of the very idea, by the potential that I saw in the possible crossover of computer science and psychology. Nowadays, I do not think that all human cognitive processes are computational, and the research program of Computational Cognitive Psychology has a different meaning for me. I would like to see all of its potential fully explored, to know the limits of the approach, and then try to understand, to describe somehow, even if only intuitively, what was left unexplained. The residuum of that explanatory process might represent the most interesting, significant aspect of being human at all. The part that remains irreducible to the most general scientific concept that we have ever discovered, the concept of computation, that part I believe to be very important. That would tell us something about the direction that the next scientific revolution, the next paradigm change, needs to take. For me, it is a question of philosophical anthropology: what is it to be human? – only driven by an exact methodology. If we ever invent true, general AI in the process, we should treat it as a by-product, as much as the invention of the computer was a by-product of Turing’s readiness to challenge some of the most important questions in the philosophy of mathematics in his work on computable numbers.
For me, Computational Cognitive Psychology, and Cognitive Science in general, do not pose a goal in themselves: they are tools to help us learn something of a higher value than how to produce technology and mimic human thinking.

What is Wikidata? How does it work? What’s the vision?

Wikidata is an open knowledge base, initially developed by parsing the already existing structured data from Wikipedia, then improved by community edits and massive imports of structured data from other databases. It is now the fastest-growing Wikimedia project, recently surpassing one billion edits. It represents knowledge as a graph in which nodes stand for items and values, and the links between them stand for properties. Such knowledge representations are RDF compliant, where RDF stands for Resource Description Framework, a W3C standard for structured data. All knowledge in systems like Wikidata takes the form of a collection of triples, or basic sentences that describe knowledge about things – anything, indeed – at the “atomic” level of granularity. For example, “Tim Berners-Lee is a human” in Wikidata translates to a sentence in which Q80 (the Wikidata identifier for “Tim Berners-Lee”) is P31 (the Wikidata identifier for the “instance of” property of things) of Q5 (the Wikidata identifier for a class of items that are “humans”). So, Q80 – P31 – Q5 is one semantic triple that codifies some knowledge on Sir Timothy John Berners-Lee, who is the creator of the World Wide Web by the invention of the Hypertext Transfer Protocol (HTTP) and the 2016 recipient of the Turing Award. All such additional facts about literally anything can be codified as semantic triples and composed to describe complex knowledge structures: in Wikidata, HTTP is Q8777, WWW is Q466, discoverer or inventor is P61, etc. All triples take the same, simple form: Subject-Predicate-Object. The RDF standard defines, in a rather abstract way, the syntax, the grammar, the set of rules that any such description of knowledge must follow in order to ensure that it will always be possible to exchange knowledge in an unambiguous way, irrespective of whether the exchange takes place between people or computers.
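
The Subject-Predicate-Object form described above can be sketched in a few lines of Python. This is only an illustration, not how Wikidata stores or serves its data: the Triple type, the LABELS table, and the render helper are hypothetical names introduced here. The identifiers are the ones quoted in the answer; the two P61 triples are composed from those identifiers for illustration and have not been checked against live Wikidata.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic statement: Subject - Predicate - Object."""
    subject: str
    predicate: str
    obj: str


# English labels for the identifiers mentioned in the interview.
# A hand-written lookup table for illustration, not a Wikidata API call.
LABELS = {
    "Q80": "Tim Berners-Lee",
    "Q5": "human",
    "Q466": "World Wide Web",
    "Q8777": "Hypertext Transfer Protocol",
    "P31": "instance of",
    "P61": "discoverer or inventor",
}

TRIPLES = [
    Triple("Q80", "P31", "Q5"),     # the example from the text
    Triple("Q466", "P61", "Q80"),   # illustrative, assumed statement
    Triple("Q8777", "P61", "Q80"),  # illustrative, assumed statement
]


def render(triple: Triple) -> str:
    """Turn a triple of identifiers into a human-readable sentence."""
    s, p, o = (LABELS[part] for part in triple)
    return f"{s} -- {p} --> {o}"


for t in TRIPLES:
    print(render(t))
```

The identifiers carry the knowledge; the labels are only there so that a person can read it.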

Wikidata began as a project to support structured data for Wikipedia and other Wikimedia projects, and today represents the data backbone of the whole Wikimedia system. Thanks to Wikidata, much of the repetition that would otherwise occur across Wikipedia and other projects has become unnecessary: the same knowledge can be served to our readers from a central repository. However, the significance of Wikidata goes way beyond what it means for Wikipedia and its sisters. The younger sister now represents knowledge on almost one hundred million things – called items in Wikidata – and keeps growing. Many APIs on the internet rely on it. Contemporary, popular AI systems like virtual assistants (Google Assistant, Siri, Amazon Alexa) make use of it. Just take a look at the number of research papers published on Wikidata, or using its data to address fundamental questions in AI. By means of the so-called external identifiers – references from our items to their representations in other databases – it represents a powerful structured data hub. I believe Wikidata nowadays has the full potential to evolve into a central node in the network of knowledge repositories online.

Wikidata External Identifiers: a network of Wikidata external identifiers based on their overlap across tens of millions of items in Wikidata, produced by Goran S. Milovanovic, Data Scientist for Wikidata @WMDE, and presented at the WikidataCon 2019, Berlin

What’s your role in this mega system? 

I take care of the development and maintenance of analytical systems that help us understand how Wikidata is used in Wikimedia websites, what the structure of Wikidata usage is, how human editors and bots approach editing Wikidata, how the use of different languages in Wikidata develops, whether it exhibits any systematic biases that we might wish to correct for, what the structure of the linkage of other online knowledge systems connected with Wikidata by means of external identifiers is, how many pageviews we receive across the Wikidata entities, and much more. I am also developing a system that tracks the Wikidata edits in real time and informs our community if there is any online news relevant to the items that are currently undergoing many revisions. It is a type of position which is known as a generalist in the Data Science lingo; in order to be able to do all these things for Wikidata I need to stretch myself quite a bit across different technologies, models and algorithms, and be able to keep them all working together and consistently in a non-trivial technological infrastructure. It is also a full-stack Data Science position where most of the time I implement the code in all development phases, from the back-end where data acquisition (the so-called ETL) takes place in Hadoop, Apache Spark, SPARQL, through machine learning where various, mostly unsupervised learning algorithms are used, towards the front-end development where we finally serve our results in interactive dashboards and reports, and finally production in virtualized environments. I am a passionate R developer and I tend to make use of the R programming language consistently across all the projects that I manage; however, it ends up being pretty much a zoo in which R co-exists with Python, SQL, SPARQL, HiveQL, XML, JSON, and other interesting beings as well.
It would be impossible for a single developer to take control of the whole process if there were no support from my colleagues in Wikimedia Deutschland and the Data Engineers from the Wikimedia Foundation’s Analytics Engineering team. My work on any new project feels like solving a puzzle; I face the “I don’t know how to do this” situation every now and then; I learn constantly and the challenge is so motivating that I truly doubt there can be many similarly interesting Data Science positions like this one. It is a very difficult position, but also a professionally very rewarding one.

If you were to explain Wikidata as a technological architecture, how would you do it in a few sentences? 

Strictly speaking, Wikidata is a dataset. Nothing more, nothing less: a collection of data represented so as to follow important standards that make it interoperable, usable in any imaginable context where it makes sense to codify knowledge in an exact way. Then there is Wikibase, a powerful extension of the MediaWiki software that runs Wikipedia as well as many other websites. Wikibase is where Wikidata lives, and from where it is served whenever anything else – a Wikipedia page, for example – needs it. But Wikibase can run any other dataset that complies with the same standards as Wikidata, of course, and Wikidata can inhabit other systems as well. If by the technological architecture you mean the collection of data centers, software, and standards that make Wikidata join in wherever Wikipedia and other Wikimedia projects need it – well, I assure you that it is a huge and rather complicated architecture underlying that infrastructure. If you imagine all possible uses of Wikidata, external to the Wikimedia universe, run in Wikibase or otherwise… then… it is the sum of all technology relying on one common architecture of knowledge representation, not of the technologies themselves.

How does Wikidata overcome different language constraints and barriers? It should be language-agnostic, right?

Wikidata is and is not language-agnostic at the same time. It would be best to say that it is aware of many different languages in parallel. At the very bottom of its toy box full of knowledge, we find abstract identifiers for things: Q identifiers for items, P identifiers for properties, L for lexemes, S for senses, F for forms… but those are just identifiers, and yes, they are language-agnostic. But things represented in Wikidata have not only identifiers but also labels, aliases, and descriptions in many different languages. Moreover, we have tons of such terms in Wikidata currently: take a look at my Wikidata Languages Landscape system for a study and an overview of the essential statistics.
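
The split between language-agnostic identifiers and per-language terms can be pictured with a small data structure. This is a sketch under stated assumptions, not Wikidata’s actual data model: the Item class and its label method are hypothetical, the fallback to English is a design choice made here for illustration, and the label strings are typed in by hand rather than fetched from Wikidata.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """An item with one language-agnostic identifier and many labels."""
    qid: str
    labels: dict[str, str] = field(default_factory=dict)  # language code -> label

    def label(self, lang: str, fallback: str = "en") -> str:
        """Return the label in `lang`, else in `fallback`, else the bare identifier."""
        return self.labels.get(lang) or self.labels.get(fallback) or self.qid


# Illustrative labels for Q5; hand-written, not copied from live Wikidata.
human = Item("Q5", {"en": "human", "de": "Mensch", "sr": "човек"})

print(human.label("de"))  # label exists in German
print(human.label("fr"))  # no French label here, falls back to English
print(Item("Q80").label("fr"))  # no labels at all, only the identifier remains
```

The identifier stays the same whichever language a reader asks for; only the term attached to it changes.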

What are knowledge graphs and why are they important for the next generation of the web?

They are important for this generation of the web too. To put it in a nutshell: graphs allow us to represent knowledge in the most abstract and most general way. They are simply very suitable to describe things and relations between them in a way that is general, unambiguous, and in a form that can quickly evolve into new, different, alternative forms of representation that are necessary for computers to process it consistently. By following common standards in graph-based knowledge representation, like RDF, we can achieve at least two super important things. First, we can potentially relate all pieces of our knowledge, connect anything that we know at all so that we can develop automated reasoning across vast collections of information and potentially infer new, previously undiscovered knowledge from them. Second, interoperability: if we all follow the same standards of knowledge representation, and program our APIs that interact over the internet so as to follow that standard, then anything online can easily enter any form of cooperation. All knowledge that can be codified in an exact way thus becomes exchangeable across the entirety of our information processing systems. It is a dream, a vision, and we find ourselves quite far away from it at the present moment, but one well worth pursuing. Knowledge graphs just turn out to be the most suitable way of expressing knowledge in a way desirable to achieve these goals. I have mentioned semantic triples, sentences of the Subject-Predicate-Object form, that we use to represent the atomic pieces of knowledge in this paradigm. Well, knowledge graphs are just sets of connected constituents of such sentences. When you have one sentence, you have a miniature graph: a Subject points to the Object. Now imagine having millions, billions of sentences that can share some of the constituents, and a serious graph begins to emerge.
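
How a graph emerges from sentences that share constituents, and how new knowledge can be inferred from it, can be sketched as follows. This is a toy example, assuming plain English labels instead of Wikidata identifiers and a single hand-picked inference rule (an instance of a class is also an instance of its superclasses); build_graph and inferred_classes are hypothetical helper names, and the class hierarchy is made up for illustration.

```python
from collections import defaultdict

# Each sentence is a (subject, predicate, object) triple.
triples = [
    ("Tim Berners-Lee", "instance of", "human"),
    ("human", "subclass of", "mammal"),
    ("mammal", "subclass of", "animal"),
    ("World Wide Web", "discoverer or inventor", "Tim Berners-Lee"),
]


def build_graph(triples):
    """Index triples by subject: node -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def inferred_classes(graph, item):
    """Classes of `item`, following 'subclass of' edges transitively."""
    found, seen = [], set()
    # Start from the classes the item is directly an instance of.
    stack = [o for p, o in graph.get(item, []) if p == "instance of"]
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue  # guard against cycles in the hierarchy
        seen.add(cls)
        found.append(cls)
        stack.extend(o for p, o in graph.get(cls, []) if p == "subclass of")
    return found


graph = build_graph(triples)
print(inferred_classes(graph, "Tim Berners-Lee"))
```

No triple states that Tim Berners-Lee is an animal; that statement follows only because the sentences share the constituents "human" and "mammal".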

A part of the knowledge graph for English (Q1860 in Wikidata)

Where do you think the internet will go in the future? What’s Wikidata’s role in that transformation?

The future is, for many reasons, always a tricky question. Let’s give it a try: on the role of Wikidata, I think we have clarified that in my previous responses: it will begin to act as a central hub of Linked Open Data sooner or later. On the future of the Internet in general, talking from the perspective of the current discussion solely: I do not think that the semantic web standards like RDF will ever reach universal acceptance, and I do not think it is even necessary for that to happen in order to enter the stage of internet evolution where complex knowledge is almost seamlessly interacting all over the place. It is desirable, but not necessary in my opinion. Look at the de facto situation: instead of evolving towards one single, common standard of knowledge and data representation, we have a connected network of APIs and data providers exchanging information by following similar enough, easily learnable standards – enough to not make software engineers and data scientists cry. Access to knowledge and data will become easier, and governments and institutions will increasingly begin to share more open data, increasing the data quality along the way. It will become a thing of good manners and prestige to do so. Data openness and interoperability will become one of the most important development indicators, tightly coupled with questions of fundamental human rights and freedoms. To have your institution’s open data served via an API and offered in different serializations that comply with the formal standards will become as expected as publishing periodicals on your work is now. Finally, the market: more data, more ideas to play with.

A lot of your work relies on technologies used in natural language processing (NLP), typically handling language at scale. What are your impressions of OpenAI’s GPT-3, which has created quite a buzz recently?

It is fascinating, except that it works better in language production that can fool someone than in language production that exhibits anything like the traces of human-like thinking. Contemporary systems like GPT-3 make me wonder whether the Turing test was ever a plausible test to detect intelligence in something – I always knew there was something I didn’t like about it. Take a look, for example, at what Gary Marcus and Ernest Davis did to GPT-3 recently: “GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about”. It is a clear example of a system that does everything to language except understand it. Its computational power and the level up to which it can mimic the superficial characteristics of the language spoken by a natural speaker are fascinating. But it suffers – and quite expectedly, I have to add – from a lack of understanding of the underlying narrative structure of the events, the processes it needs to describe, the complex interaction of language semantics and pragmatics that human speakers face no problems with. The contemporary models in NLP are all essentially based on an attempt to learn the structure of correlations between linguistic constituents of words, sentences, and documents, and that similarity-based approach has very well-known limits. It was Noam Chomsky who in the late 50s – yes, 50s – tried to explain to the famous psychologist B. F. Skinner that just observing statistical data on the co-occurrences of various constituents of language will never provide for a representational structure powerful enough to represent and process natural language. Skinner didn’t seem to care back then, and neither do the fans of the contemporary Deep Learning paradigm, which is essentially doing exactly that, just in a way orders of magnitude more elaborate than anyone ever tried. I think we are beginning to face the limits of that approach with GPT-3 and similar systems.
Personally, I am more worried about the possible misuse of such models to produce fake news and fool people into silly interpretations and decisions based on false, simulated information than about whether GPT-3 will ever grow up to become a philosopher, because it certainly will not. It can simulate language, but only the manifest characteristics of it; it is not a sense-generating machine. It does not think. For that, you need some strong symbolic, not connectionist, representation engaged in the control of associative processes. Associations and statistics alone will not do.

Do humanities have a future in the algorithmic world? How do you see the future of humanities in the fully data-driven world?

First, a question: is the world ever going to be fully data-driven, and what does that mean at all? Is a data-driven world one in which all human activity is passivized and all our decisions transferred to algorithms? It is questionable whether something like that is possible at all, and I think that we all already agree that it is certainly not desirable. While the contemporary developments in Data Science, Machine Learning, AI, and other related fields are really fascinating, and while our society is becoming more and more dependent upon the products of such developments, we should not forget that we are light years away from anything comparable to true AI, sometimes termed AGI (Artificial General Intelligence). And I imagine only true AI would be powerful enough to run the place so that we can take a permanent vacation. But then comes the ethical question, one of immediate and essential importance: would such systems, if they ever came into existence, be able to judge human action and act upon society in a way we as their makers would accept as moral? And only then comes the question of whether we want something like that in the first place: wouldn’t it be a bit boring to have nothing to do and go meet your date because something smarter than us has invented a new congruence score and started matching people while following an experimental design for further evaluation and improvement?

Optimization is important, but it is not necessarily beautiful. Many interesting and nice things in our lives and societies are deeply related to the fact that there are real risks that we need to take into account because irreducible randomness is present in our environment. Would an AI system in full control prevent us from trying to conquer Mars because it is dangerous? Wait, discovering the Americas, the radioactive elements, and the flight to the Moon were dangerous too! I can imagine that humanity would begin to invent expertise in de-optimizing things and processes in our environments if some fully data-driven and AI-based world ever came into existence. Any AI in a data-driven world that we imagine nowadays can be no more than our counselor, except in the edge case that it develops true consciousness, should that turn out to be a realistic scenario. If that happens, we would have to seriously approach the question of how to handle our relationship to AI and its power ethically.

I do not see how the humanities could possibly be jeopardized in this world that is increasingly dependent on information technologies, data, and automation. To understand and discover new ways to interpret the sense-generating narratives of our lives and societies was always a very deep, a very essential human need. Does industrialization, including our own Fourth age of data and automation, necessarily conflict with a world aware of the Shakespearean tragedy, as it does in Huxley’s “Brave New World”? I don’t think it is necessarily so. I enjoy the dystopian discourse very much because I find that it offers so many opportunities to reflect upon our possible futures, but I do not see us living in a dystopian society anytime soon. It is just that somehow the poetics of the dystopian discourse are well-aligned, correlated with the current technological developments, but if there ever was a correlation from which we should not infer any fatalistic causation, this is the one.