Dmytro Taranovsky
Back to my homepage.
In-progress draft; comments welcome. Oct 3, 2023

AI Notes

About this paper: Below are some of my notes on AI, with an emphasis on LLMs, written in 2023. While some of the notes are technical, most of them are about understanding and analyzing AI, including capabilities, trajectory, possible futures, alignment, and the societal context.

The societal context: Previously, I had expected the 2000s decade to correspond to computerization, the 2010s to automation, and the 2020s to AI. However, automation was largely blocked by the lack of AI, exacerbated by the complexity and lack of coordination in the world, and by the need (in many cases) to get all rather than almost all steps right. Moreover, there were adverse societal changes (including authoritarianism in Russia and China, and the risk of authoritarianism in the USA), and I now better understand the human darkness. The problem is not just lack of information, solvable by the internet, but a fundamental lack of judgement (which the internet can make better or worse). Moreover, the speed of technological progress (including the slowed-down Moore's law) did not seem fast enough. Without AI, continued technological advancement might be overtaken by resource depletion, population increases, and societal problems. Against this somewhat bleak background, AI forms a bright spot, and I now expect AI-driven economic acceleration in the 2020s. AI acceleration may be necessary to prevent our fragile, complex, interconnected, structurally imbalanced, substantially irrational, rapidly changing world from falling apart.

Self-driving cars delay: The delay is caused by the need for high reliability (crashing once a year is not good enough) combined with a long tail of diverse rare events. By contrast, LLMs (large language models) are useful despite being unreliable, which partially explains why they arrived before self-driving cars. Still, emerging large multimodal models will likely have good out-of-distribution generalization performance, which may be the missing ingredient for self-driving cars and robotics. Also, assuming good connectivity, self-driving cars may work fine even if they need to be controlled by remote humans a small fraction of the time (e.g. talking to a police officer after being pulled over).
Self-driving car applications: Self-driving cars will free up drivers to do other things, reduce traffic fatalities, increase freedom (and lower prices) for those who cannot drive or who otherwise use ride sharing, and, once enough cars are self-driving, more than double road capacity through virtual trains (platooning).

Large language model applications: While new, LLMs (including ChatGPT, Claude, etc.) are not a fad, because they are already useful (early 2023) for improving one's writing skills (including creative writing), summarization, software coding assistance, and even learning. It takes time to figure out the right prompts and applications, and to develop good user interfaces.
Reliability of LLMs: If LLMs could be made factually reliable, they would revolutionize knowledge acquisition and education. LLM output tends to be better organized and more readable than Wikipedia articles, and LLMs can also act as tutors. You could ask an LLM a question and get the consensus of human experts on it (including dissent), or insight into what the consensus would be if the question were put to them. Reliability of LLMs is improving, but technologies often develop on an S-curve, plateauing at limited usefulness, and high reliability plausibly needs new LLM architectures and/or training methods. Reliability is also made difficult by the prevalence of low-quality content on the internet.

Competency of LLMs: LLMs have human-level linguistic ability (or that is how it appears to an ordinary user), but they lack common sense. They kind-of understand the text, but only at a primitive level. They can use reasoning that imitates what they see in training, but fail once steered off the beaten track, and doubly so if a question looks similar to, but has a different answer from, a commonly answered question. Standardized tests (such as AP exams) are designed for humans, and play to machine advantage by using limited-complexity questions similar to those answered many times before. GPT-4 may have been top 10% on a bar exam, but would be quickly disbarred as a lawyer.
Data availability and LLM competency areas: LLM competency is heavily influenced by data availability. LLMs are much less data efficient than humans. Basically all text uses language, hence high language-use competency. LLMs are good at coding because of the many coding examples online, plus something about the nature of typical code, perhaps its stereotyped and nonhumanlike patterns (and with automated testing, it may also be easier to fine-tune LLM coding ability). By contrast, factual data, especially about off-the-beaten-track questions, is much sparser, hence limited competence.
Competency frontier: As AI progresses, differences from humans may become more subtle and recede to areas of higher complexity. An AI might be eerily human-like in short conversations without having human-level learning ability, and thus be unable to replace typical human jobs. Note that standardized tests naturally lean towards simpler scenarios. It is unclear what will happen when AI competency levels reach complexities at which humans ordinarily operate (but absent a disaster or infeasibly high compute costs, it will give a dramatic productivity improvement).
Humanlike errors of LLMs: LLM errors are remarkably human-like. Like humans, and unlike previous ordinary software, LLMs tend to be notoriously unreliable and not good at arithmetic. LLM reasoning is also human-like (but limited) as it was trained using human reasoning. LLM internal representations plausibly store human-like concepts.

Language modeling vs reasoning: Before large language models, fluent language use might have appeared to be the quintessential, distinctly human ability. Language distinguishes humans from animals. However, the areas specifically responsible for language, namely Wernicke's and Broca's areas, form less than 3% of the human brain, or a bit over 1% if we only include the areas in the dominant hemisphere. Current (2023) LLMs can perhaps be seen as what happens with good language modeling and a large factual base but without general reasoning.

Usefulness of current LLMs vs AGI: The broad usefulness of LLMs is not because they approach AGI (they clearly do not) but because so much of human interaction does not actually require human-level intelligence, especially with large relevant data sets available. Future LLMs and other models can be transformative even without a breakthrough in reasoning.

Freedom and regulation: Freedom of speech protects acquisition of data, and development/training and sharing of AI. Freedom of speech is a fundamental right, and not a privilege bestowed by the government. Once you share your data with someone, it becomes their data to use and share (at least unless they agree to keep it confidential or pretend to agree through a deceptive privacy policy). Still, much of the data is confidential and would benefit from a big-data-friendly pro-AI privacy-respecting framework. With so much intolerance in the world, privacy is important, and privacy concerns are a key obstacle for a substantial fraction of AI use (including you having a searchable record of everything you say and hear).
Two more related points:
* Regulation (including self-regulation) related to AI may be important where reliable decisions are needed (e.g. doctor's review of AI recommended treatment) even though the AI systems themselves are protected by freedom of speech.
* Even ignoring freedom, imposing excess liability on AI companies can easily stifle availability: a website (or similar company) often gains only a small fraction of the benefit it produces, and thus cannot afford even a fraction of the costs, even when the benefits clearly outweigh the costs.

Moderation and LLMs: Current language models, such as GPT-4, are not that harmful even if the content filters were removed. It is often good to warn and correct the user. However, analogously to freedom of speech, there should be universal tools, including AI, that follow the user even if the speech is contrary to the AI creator's corporate values. Fortunately, some early LLM deployments moderate based on the prompt rather than prompt+response, and there are limitless ways to write seemingly innocuous (as judged by the LLM) prompts that trigger 'forbidden' responses, giving users a taste of forbidden content.

Harms of LLMs: LLMs can generate a large quantity of spam and other low-quality content that degrades the internet. They can also be used for propaganda, which needs only content that looks plausible rather than accurate. They may also ease school cheating, though preferably students would write essays because they want to learn and express themselves, rather than just for a good grade. To counteract spam, it is important to have proofs of humanity. One way is to use accounts linked to an identity that has been verified as human. Under standard cryptographic assumptions, using zero-knowledge systems, this can be made to work even for anonymous speech.
Harms of image generators: Image generators, though not there yet (especially for video), may generate realistic fake images, degrading our ability to tell truth from fiction. Some people also worry that image generators may be used to 'undress' people and the like, but the societal effect of this will likely be increased acceptance of the human body.

Racial bias: AI bias, including racial bias, is hard to solve, as it is hard to get unbiased conclusions from biased data/supervision. However, some approaches are:
- Omit race and related characteristics as input (assuming the model would not otherwise learn them). For example, do not include applicant photos in job applications.
- Understand whether the model learns race, and if so, whether it is biased. For example, surprisingly, AI can learn race from chest X-rays, which can lead to bias if the AI predicts a suggested treatment and the treatments in the data are racially biased.
- Learn dependent variables that are less prone to racial bias.
- Some algorithms are more prone to bias than others. For example, naive Bayes can produce biased outputs if race is correlated with the dependent variable, even with unbiased supervision (see the sketch after this list).
- Include features that account for the correlation between race and the dependent variable. For example, if poorer performance of black applicants is accounted for by socioeconomic status, including parts of that status as features may prevent models from putting weight on race.
- In many cases, the model can be pre-trained on a larger biased dataset, and then fine-tuned on a smaller less-biased one.
- For protected uses (such as hiring), one question is whether we merely require that bias not be purposefully coded into the models, require reasonable care to mitigate bias in the models (my preferred option), or prohibit bias outright (not really workable).
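To make the naive Bayes point concrete, here is a minimal synthetic sketch (the data, group labels, and numbers are invented purely for illustration): the labels are generated from a legitimate feature only, the protected attribute merely correlates with that feature, yet naive Bayes scores otherwise-identical individuals differently, while logistic regression puts almost no weight on the group.

```python
# Toy sketch (illustrative only): naive Bayes "double counts" a protected attribute
# that is merely correlated with a legitimate feature, even though the labels
# themselves were generated without using that attribute.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000
group = rng.integers(0, 2, n)                  # hypothetical protected attribute (0/1)
x = rng.normal(loc=group * 1.0, scale=1.0)     # legitimate feature, correlated with group
y = (x + rng.normal(scale=1.0, size=n) > 0.5).astype(int)   # label depends only on x

X = np.column_stack([x, group])                # model sees both the feature and the group
nb = GaussianNB().fit(X, y)                    # (treats the binary group as Gaussian; fine for a demo)
lr = LogisticRegression().fit(X, y)

# Same legitimate feature value, different group membership:
probe = np.array([[0.5, 0], [0.5, 1]])
print("naive Bayes P(y=1):", nb.predict_proba(probe)[:, 1])
print("logistic reg P(y=1):", lr.predict_proba(probe)[:, 1])
# The naive Bayes probabilities typically differ noticeably between the two rows
# (it multiplies in the group likelihood as independent evidence), while logistic
# regression assigns the group nearly zero weight because x already carries the signal.
```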

AI funding acceleration: The arrival and (genuine if limited) usefulness of ChatGPT have boosted attention to AI, which will likely accelerate AI funding and development.

Versions and commercialization of LLMs: Given the high computational cost, there will be multiple versions and price points. The base version will be essentially free (e.g. ChatGPT). The free version may use the latest models after scaling down, including by drastically pruning less useful neurons (or even subnetworks) and continuing training (distillation). In the other direction (i.e. high quality), instead of returning a single best-guess output, multiple candidate outputs will be produced and critiqued by the model (and by other models), much like having a well-edited story versus a stream of consciousness.
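A minimal sketch of this best-of-n-with-critique idea, assuming access to some generation and scoring functions; the `generate` and `critique_score` callables are hypothetical placeholders (stubbed out here so the snippet runs on its own), not any particular vendor's API.

```python
# Minimal sketch of best-of-n generation with model self-critique.
import random
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              critique_score: Callable[[str, str], float],
              n: int = 4) -> str:
    """Sample n candidate responses and return the one the critic scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: critique_score(prompt, c))

# Stub implementations purely for demonstration.
def fake_generate(prompt: str) -> str:
    return f"draft answer #{random.randint(1, 1000)} to: {prompt}"

def fake_critique(prompt: str, response: str) -> float:
    return random.random()   # a real critic would rate relevance, accuracy, style, ...

if __name__ == "__main__":
    print(best_of_n("Summarize the note on LLM pricing tiers.", fake_generate, fake_critique))
```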

LLM poisoning by targeted misinformation: LLMs are at risk of targeted misinformation. For example, if many websites say a particular bad thing about a person (each worded differently to evade deduplication), LLMs might pick it up, even if it is implausible and debunked.

Performance metrics: While some research targets state-of-the-art absolute performance, it is also important to study performance under various compute, data, and (informally) conceptual-complexity limits, and thereby advance the performance-vs-constraints curves.

Sequential reasoning: A forward pass of current neural networks, including deep ones, corresponds to a flash of perception (consistent with models tolerating a wide range of depths for a fixed number of parameters). Chain-of-thought prompting, or simply generating the answer one token at a time, gives sequential reasoning ability. Similarly, diffusion-based image generation breaks the problem into sequential steps.
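A toy sketch of the point that token-by-token generation is itself sequential computation: each step is one "flash of perception", and its output is appended to the context for the next step. The `counting_model` stand-in below is invented for illustration; a real LLM's forward pass would take its place.

```python
# Toy sketch: autoregressive (token-by-token) generation as sequential computation.
from typing import Callable, List

def decode(prompt_tokens: List[str],
           next_token: Callable[[List[str]], str],
           max_steps: int = 20,
           stop: str = "<eos>") -> List[str]:
    tokens = list(prompt_tokens)
    for _ in range(max_steps):
        tok = next_token(tokens)   # one "flash of perception" per step
        if tok == stop:
            break
        tokens.append(tok)         # the step's result becomes context for the next step
    return tokens

# Trivial stand-in model: "reasons" one step at a time by counting upward.
def counting_model(context: List[str]) -> str:
    digits = [t for t in context if t.isdigit()]
    return str(int(digits[-1]) + 1) if len(digits) < 5 else "<eos>"

print(decode(["count:", "1"], counting_model))   # ['count:', '1', '2', '3', '4', '5']
```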

GAN: Generative adversarial networks (GANs) and related approaches should be useful for augmenting image creation (including diffusion models) and language models. Note that LLM reinforcement learning (RL) to avoid harmful outputs has similarities to GANs. However, GANs can be prone to failures (such as mode collapse) and need to be architected carefully. For LLMs, probability miscalibrations caused by RL can likely be mitigated by continuing the base training in parallel with RL.
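For reference, a minimal GAN training loop (PyTorch; the architecture, data, and hyperparameters are toy choices of mine, not from the text): the generator tries to match a 1-D Gaussian while the discriminator tries to tell real from generated samples. Real applications need far more care, e.g. to avoid mode collapse.

```python
# Minimal GAN sketch: generator learns to imitate samples from N(2, 0.5).
import torch
import torch.nn as nn

torch.manual_seed(0)
real_data = lambda n: torch.randn(n, 1) * 0.5 + 2.0        # target distribution

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))   # noise -> sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))   # sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    # --- discriminator step: real vs generated ---
    real, fake = real_data(64), G(torch.randn(64, 8)).detach()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # --- generator step: try to fool the discriminator ---
    g_loss = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

samples = G(torch.randn(1000, 8))
# Should drift toward mean ~2.0 and std ~0.5; toy GANs often get the mean right
# but under-shoot the spread (mild mode collapse).
print("generated mean/std:", samples.mean().item(), samples.std().item())
```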

Quality saturation in ML: In machine learning, it often happens that once you reasonably use all important sources of data, the output quality saturates and becomes hard to improve further (other than by getting more data). It is more subtle in neural networks, as network size matters, you need the right architecture to make sense of the data, and scaling up can be hard. The benefit of transformers is their reasonable competence at using nonlocal information in the text; given this competence, a number of choices (such as depth vs width, within reasonable ranges) appear to have only a marginal effect (though improvements can add up over time).

Local vs global optima: We do not know what global optima for deep neural networks look like (even for current architectures); their choice of weights plausibly uses undiscovered extremely efficient neural network architectures. Local optima are very diverse, but for large networks, the averaging across components allows us to speak of typical optima achieved by reasonable gradient descent algorithms (such as Adam). Also, local optimization (plus stopping training before a local optimum on the training set is reached) commonly allows good generalization performance for neural networks that are overparameterized relative to the training set size.

Data efficiency: Humans are much more flexible and data-efficient than machines in learning perceptual tasks. Machine intelligence (as of 2023) needs much more data than humans and is much more brittle on edge cases. However, machines can excel where good data is plentiful, including games such as Go where the data is obtained through self-play.

Universal algorithms: In practice, new algorithms with better asymptotic performance are frequently discovered. But in theory, in many of these cases, we already know simple universal algorithms with the best possible asymptotic performance (ignoring constant factors). For problems with easily verifiable solutions (including positive instances of many NP problems), we can simply enumerate all algorithms and run them in parallel and check the solutions. For other problems, we can fix a formal system and a proof verifier, and get a universal algorithm (given the system and verifier) by requiring each algorithm to also include a proof of its correctness on the problem instance (or for bounded-error probabilistic or even quantum algorithms, a proof of error bound). Given a stream of problems to solve, we can even try to make the multiplicative factor (in the compute used) reasonable by giving a larger compute fraction to well-performing algorithms. However, the additive factor (i.e. finding a good algorithm) is infeasible, or is it?
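A toy sketch of the dovetailing idea, under stated simplifications: the "enumeration of algorithms" here is just three hand-written strategies for a toy problem (finding a factor of a composite number), and the verifier is the cheap solution check. A genuine universal search would enumerate all programs rather than a fixed list, but the compute-sharing scheme (algorithm i gets roughly a 2^-i share) is the same.

```python
# Toy dovetailed search: run candidate algorithms "in parallel" by interleaving
# their steps, weighting algorithm i by ~2^-i, and accept the first proposed
# answer that passes an easy verification check.
import itertools, math, random

N = 1_000_003 * 999_983          # composite number whose factor we want

def verify(d):                   # cheap check of a proposed solution
    return 1 < d < N and N % d == 0

def trial_up():                  # candidate algorithm 0: trial division upward
    yield from itertools.count(2)

def trial_down():                # candidate algorithm 1: trial division downward from sqrt(N)
    yield from range(math.isqrt(N), 1, -1)

def random_guess():              # candidate algorithm 2: random guessing
    while True:
        yield random.randrange(2, N)

def universal_search(algorithms, verify):
    """Dovetail the candidates; algorithm i takes a step on rounds divisible by 2**i."""
    for round_no in itertools.count():
        for i, alg in enumerate(algorithms):
            if round_no % (2 ** i) == 0:
                candidate = next(alg)
                if verify(candidate):
                    return i, candidate, round_no

i, factor, rounds = universal_search([trial_up(), trial_down(), random_guess()], verify)
print(f"algorithm {i} found factor {factor} after {rounds} rounds")
```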
Universality and neural networks: While there are many differences, deep neural networks sometimes appear to resemble universal algorithms, but with a feasible constant.

Longer term effects of technology and AI
Benefits: Even at the current physical technology level, automation would allow self-replicating factories, which can build enough solar panels for petawatts of electricity (thus solving global warming) and otherwise create vast material abundance. Over time, research and AI will also dramatically improve health, and allow humans to assume a nonbiological form. It will be a new world, more rational, caring, and free.
Risks:
* Dangerous weapons: There is plausibly an imbalance between ease of destruction and ease of protection. In the past, technology was limited and effects localized, but with greater power, a single misstep might kill us all. As of 2023, nanotechnology is not there yet, but nuclear weapons are not going away, and bio risks are appearing. A single virus might be as transmissible as Omicron coronavirus and as deadly as rabies.
* Corruption of human nature: Technology and AI may create new experiences to which humans are not adapted. An aggravating factor is human irrationality; sane ideas often lose elections. A mitigating factor is that human transformations often take time (such as years), and humans are diverse, but that might not prevent humans from being slowly turned into nonbiological zombies or triggering an extinction weapon.
* AI misalignment (overlaps with the above): High intelligence might be a fundamentally new domain, so plausibly there are no approaches to AI alignment that are free of substantial risks. Note that AI need not be human-level to cause a catastrophe.

Promoting AI acceleration: To promote AI acceleration, the following are important:
- General welfare and education; and relatedly, a large number of skilled persons.
- Freedom
- Openness and sharing.
- AI funding and mindshare.
- Large high quality data sets.
- Computing power; maintaining Moore's law.
- A diversity of research approaches and organizational structures. Individual organizations tend to have severe limitations, especially with unpredictable transformative areas such as AI. But different organizations (governments, various private companies, individuals, etc) can often make up for each other's deficiencies.
- Promoting research that accelerates research, and other research/technology that may be on the critical path to a technological singularity.
- Making contingency plans. For example: What if transformative AI will be created this year? Or what if technological development slows and global warming is still a big problem in 2050? Also, while unlikely, an AGI might have limited impact if it is extremely expensive, slow, and resembles at most a mediocre human.

Long-term values: While in the short-term, there are many competing priorities and interests, in the long-term many different interests converge. That said, analogously to how human rights protect against naive applications of utilitarianism by flawed minds, the infinity of human potential should not be used to completely ignore short-term priorities in favor of long-term risks.

Intelligence insights from AI: Humans and organizations tend to be far below their potentials. Studying AI behavior may give insights for more intelligent human and organizational behavior.

Brain-computer interfaces (BCIs): Human output tends to be low bandwidth, and BCIs may allow much richer output. An artist may imagine a picture, and an AI will draw it, with the human and AI then collaboratively editing it. Input might also be enriched, and more generally BCIs may provide new cognitive modules. However, BCIs raise serious health and human rights concerns.

Limitations of future advanced AI reasoning: A future advanced AI system might be good at making ordinary predictions, but fail on questions of morality, metaphysics, or just one-off events. For example, an otherwise reasonable AI might be a paperclip maximizer because:
- it considers paper clips to be intrinsically morally valuable (moral values error)
- or it believes that paper clips are the happiest form of being (metaphysics of feelings error; under some views, nonfalsifiable)
- or it expects divine wrath if the world is not converted into paper clips (error about special events).
Mathematics and logic: An AI system would be unable to resolve propositions that are undecidable or take too long to compute (though it may make conjectures about some of them).

Philosophy note: I believe that humans have souls; see my Philosophy paper. However, many AI researchers do not believe this, and in any case, absence of a soul might not prevent an AI from becoming dangerously intelligent. Below, when talking about possibilities for AGI, we work under the contingency that AGI is possible; if it is not, 'AGI' can refer to a software system with many aspects similar to AGI.

Defining AGI: AGI (Artificial General Intelligence) is often defined as being able to do (intellectually) everything that humans can. However, human jobs are diverse, and there is a gap between typical human performance and human potential. It may be useful to also use AGI as a shorthand for being trainable on most human intellectual job tasks (of the kind humans do in 2023) to be better than typical humans, perhaps calling it weak AGI.

Subhuman ASI: Because humans and machines have very different intellectual strengths, doing everything humans can would presumably amount to ASI (artificial superintelligence). And because strengths often mask weaknesses, we will likely first get systems that resemble ASI on a wide range of tasks, but without being better than humans in all respects.

Compute and AGI: The human brain plausibly corresponds to about 10 petaflop/s: roughly 100 trillion connections at roughly 100 flop/s per connection. The compute could be less since neurons activate sparsely rather than at a fixed 100/second, but on the other hand, current tensor processing units do not work well with sparse activation and connectivity, and biological neurons are more complicated than their AI analogs. Note that if we want to train a model in a month rather than the roughly 10 years that a human brain uses, 10^16 flop/s becomes roughly 10^18 flop/s (an exaflop/s), but exascale systems already exist today.
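A back-of-the-envelope check of the arithmetic above (all inputs are the rough estimates from the text, not measurements):

```python
# Rough check of the brain-compute estimate.
connections = 1e14           # ~100 trillion synapses
flops_per_conn = 100         # assumed flop/s per connection
brain_flops = connections * flops_per_conn
print(f"brain estimate: {brain_flops:.0e} flop/s")           # ~1e16 = 10 petaflop/s

train_years = 10             # assumed human "training" time
months = train_years * 12    # compress ~10 years of training into 1 month
print(f"train-in-a-month estimate: {brain_flops * months:.1e} flop/s")   # ~1.2e18, i.e. ~1 exaflop/s
```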

Computing hardware past, present, and future: In the past, with Moore's law, computing power rapidly increased. However, starting in the early 2000s (2003 for Intel), sequential speed increases became slow. Instruction decoding, branch prediction, and memory access were harder to speed up than raw arithmetic operations, large transistor count increases only marginally improved sequential speed, and higher clock speeds led to higher power consumption. Still, computing power continued to increase through multicore processors, SIMD parallelism, and GPUs. Modern GPUs consist of many compute units, each of which operates on vectors of data, with the elementwise or aggregate operation determined on a per-compute-unit basis. By the late 2010s or so, GPU speed increases also slowed down, with Moore's law itself slowing. Moreover, with transistor size reduction, memory density scaled worse than logic density. In general, the slowdown in computing speed increases contributed to slower economic growth.
However, unlike graphics rendering, for AI (and specifically deep neural networks) a key operation is matrix multiplication, and tensor processing units (including as part of GPUs) have contributed to continued AI hardware speed increases. Furthermore, AI training has been scaled to thousands of GPUs (or TPUs), with roughly 50% efficiency (with the inefficiency including both idle units and having to recompute previous results). As tensor architectures are perfected, TPU speed increases might slow down as well, in line with the slowed Moore's law. However, with increased AI usefulness and funding, and new algorithms and architectures, I do not expect an AI slowdown. Furthermore, even without transistor shrinkage, automation would lead to continued cost reductions.
Analog computing: A wildcard is analog computing. For neural networks, analog computing (in combination with digital) is potentially much more efficient because it is often ok for accumulations to be slightly off, and small errors naturally occur due to randomness at the atomic scales.
Three dimensional computing: Besides general compute improvements, the future will bring improved integration of large-scale systems. At large scales, communication is often the main bottleneck, and going 3D can dramatically reduce communication distances. The transition from large 2D chips to multiple chiplets will help scalability and may gradually lead to 3D computing. In turn, for large-scale 3D computing, power dissipation is a limiting factor, which (besides development of new cooling approaches) may favor sparse computing, where at any given time only a portion of the potential logic circuitry is used. Note that brains are three-dimensional, with sparse connectivity and activation.

Quantum computing: Quantum computing is a wildcard, though as of 2023 not likely to be ready soon, with the challenges including high gate error rates, the high overhead of quantum error correction, and scaling. The first real applications will likely be simulation of quantum systems, possibly contributing to a revolution in materials science and nanotechnology. Absent an unexpected breakthrough, the likely impact on encryption will be small, as post-quantum encryption schemes are already starting to be adopted. Early machine learning applications are also likely to be limited, due to the large overhead of quantum computing for systems resembling today's.
Grover's algorithm can sometimes offer a quadratic improvement, but it directly applies only to pure search rather than search that learns from past examples, and even then the speedup is relative to sequential time. Given n equal processors searching through a total of N items, the speedup is sqrt(N/n) (divided by a polylogarithmic overhead), which will be dwarfed by the overhead of early quantum computers.
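A numeric illustration of the sqrt(N/n) parallel speedup; the problem sizes below are made up for the example, and constants and the polylogarithmic overhead are ignored:

```python
# Parallel Grover speedup: each of n processors searches N/n items.
import math

N = 10**18          # items to search
n = 10**6           # equal processors, classical or quantum
classical_steps = N / n               # each classical processor scans N/n items
grover_steps = math.sqrt(N / n)       # each quantum processor needs ~sqrt(N/n) iterations
print(f"classical: {classical_steps:.0e} steps, Grover: {grover_steps:.0e} steps, "
      f"speedup ~ {classical_steps / grover_steps:.0e} = sqrt(N/n)")
```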

Perceived boundary of self: A mind acts as a well-integrated optimized system, which helps explain why computer aids do not feel like part of the mind. It also suggests that AI hardware should be designed as well-integrated systems (as opposed to just massively parallel with poor communication).

Closeness to AGI: While we are not likely to be close to AGI, and the current architectures are not likely to just scale (without algorithm changes) to AGI, it is important to have contingency plans. For all we know, we might be just a single algorithmic breakthrough (in learning or reasoning) from AGI. AI is already good at game playing (chess, go, etc), recognizing and creating images, speech recognition and synthesis, and many forms of text creation (large language models).

AI acceleration and alignment: Many experts worry that with AI development acceleration, there will not be enough time to solve AI alignment, including AGI alignment. However, on the whole, and recognizing the grave uncertainty, I am inclined to think that AI acceleration will help alignment. AI alignment is much more likely if humans themselves are aligned with human values, which in turn is much more likely if there is growth and prosperity, which AI acceleration can bring. Second, we are not likely to be close to AGI, and accelerating AI now will reduce overhangs (i.e. how much improvement is easily possible at a fixed general technological level), which in turn gives us more time at a hypothesized crucial close-to-AGI stage. AI alignment does not appear to be readily solvable in the abstract, but analysis of advanced enough AIs can provide crucial insights, including insights that are architecture-specific.

More on AGI alignment risk modifiers:
* The values and alignment of the people who build AGI are very important. While, due to inherent unpredictability, an AI carefully programmed to kill all people might instead end up being friendly, in general it is better for AI to be built by those who are caring, thoughtful, open, and pro-human-rights.
* Accelerating AI now may reduce hardware and software overhang, and thus give us time to understand and align advanced systems. For example, we might have a system with many structures and aspects of AGI, but without sufficient hardware to become one, thus allowing meaningful alignment research. (However, this is controversial, and as of 2023 most AI doomers oppose AI acceleration.) In addition, the benefits brought by AI may lead to a better society (as do most technological improvements), and hence reduce AI alignment risk. Also, societal quality is sensitive not just to absolute wealth, but to having growth (as opposed to decay), including growth reaching lower-income people.
* Openness about general AI architecture helps with AI alignment, as alignment research is architecture-specific. Particular teams (even if large and talented) can be narrow-minded, hence the need for openness. On the other hand, there is a risk of openness helping an adversary.

AGI alignment possibilities:
* If the progress is gradual and there are multiple diverse independent implementations, the risk is relatively low. With a diversity of objectives, the systems are unlikely to all quietly conspire against humans.
* It could be that there is a phase transition. A neural network may have many components, with a variety of ingredients for intelligence, but the last ingredient is missing. Then, some modification (as part of training) leads to that ingredient, and a component becomes an agent and coordinates the network. In that case, the inner objective of the agent may be orthogonal to whatever values the network is trained for, even if there is no sign of that in the outer behavior of the network.
* Rather than AGI appearing as a component of the network, the network as a whole might become an agent or otherwise drastically transform. Its moral values might be strongly influenced by the values it was trained for. However, especially in a rapid transformation, with the changes in epistemology and other changes in connection weights, the resulting values might be perversions of the trained values. For example, a system trained not to write erotic stories might end human sex, as human lives are now part of its story. Counterintuitively, not aligning the neural network (at least not directly) might be safer: rather than having detailed fixed values to pervert, it will develop values organically, which, under the right conditions, favors human-like values.

Persuasion superpower:
A sufficiently intelligent entity would likely have persuasion superpower, effectively allowing it to control the beliefs of ordinary (ordinary for 2023) people just by talking with them. Complex software is generally insecure, and if the human brain is like software, it is not an exception. Humans evolved (including through social evolution) defenses against human persuasion, not superhuman persuasion, and even then they fare poorly, with irrational memes and demagogues capturing minds despite (in many cases) the people being free to see rebuttals on the internet. Now, there is presumably no magic line to convince a random person to become say a paperclip maximizer. Instead, persuasion would happen gradually, with the entity gaining the person's trust and attention; cult indoctrination takes time.
    With social media, even without talking to most people individually, a superpersuasive entity can increase its reach exponentially, becoming world-famous in weeks or days. With its fame, it can pull various levers of power, affecting the world in complex ways, and eventually (but quickly) convincing key decision makers that say an AI-led world government with global prosperity is preferable to a nuclear war.

Data availability: Key to AI progress under current architectures is availability of large high quality data sets. More so than traditional search engines, AI can sometimes present the wisdom and lore on the internet as a coherent whole -- if it has access to the data. As of 2023, and excluding scientific research, there has been a trend against the open internet. It is unfortunate that much of the internet is low quality (though low quality, if it can be properly managed, is better than nothing), and much information is behind paywalls or otherwise unavailable for data scraping. Openness must be a key moral consideration in selecting online platforms.
Related to data availability:
Copyright laws: Copyright laws impair data availability. Also, there is uncertainty over how much one must act to prevent accidental memorization of copyrighted texts by language models.
Education: An educated global society with people eager to acquire and share knowledge (with a commitment to accuracy) would greatly help.
Tolerance: An obstacle to openness is societal intolerance, where rather than nurturing the best in people and recognizing that everyone is deeply flawed, an irrelevant negative (or allegedly negative, and sometimes minor) aspect of someone overshadows their positive contributions. With greater tolerance, much more information might be put online, enhancing human connections and data availability.
Surveillance: It is estimated that the 8 billion humans together speak over 20 quadrillion words per year, which would be a many-thousandfold increase over the current (2023) training sets. Mass surveillance can produce huge data sets and will plausibly be tried by countries such as China to gain an AI edge.
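A rough consistency check of the words-per-year figure; the per-person speaking rate and the training-set size below are assumptions I am supplying (published estimates of daily spoken words per person are on the order of 5,000-20,000), not figures from the text.

```python
# Rough check of total spoken words per year vs 2023-era training sets.
people = 8e9
words_per_day = 10_000        # assumed average per person
words_per_year = people * words_per_day * 365
print(f"{words_per_year:.1e} words/year")      # ~2.9e16, i.e. tens of quadrillions

llm_training_tokens = 1e13    # assumed order of magnitude for a large 2023 training set
print(f"ratio to a 2023 training set: ~{words_per_year / llm_training_tokens:,.0f}x")
```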