Maximal Computing #
This section introduces you to “maximal computing,” which refers to any computationally intensive work, including working with blockchains, Big Data, and Machine Learning (ML) and other types of Artificial Intelligence (AI).
Key Recommendations #
- Develop a clear understanding of when it is appropriate to use AI and blockchain for a humanities research problem.
- When you do need to use maximal computing, consider less intensive approaches, such as using smaller models and adapting existing infrastructures.
- Understand how to use efficiency metrics like T-scores, confusion matrices, accuracy, recall, area under the ROC curve, and log loss.
Problems of Proportionality #
Maximal computing systems are everywhere these days. For example, AI technologies have been applied to speech recognition, natural language processing, computer vision, and many other areas. In many parts of the world, AI is present within our homes, cars, and mobile phones. New applications of AI are being implemented with the help of large, publicly open datasets available on the web (for example, open crawl), which in turn are fed to models trained on large-scale hardware (GPUs and TPUs). Such maximal computing may help with adaptation to climate change; Gordon S. Blair gives the example of flooding resilience:
The challenges facing flood risk management practitioners are considerable as they make long-term decisions, e.g., about investments in flood defenses, with limited budgets. [...] Thanks to developments in digital technology, though, major changes are now anticipated, in particular related to the plethora of data becoming available (cf. big data)—from satellite imagery, from sensors deployed around catchments (cf. the Internet of Things), from detailed studies carried out by local authorities, from citizen science, and from mining data from the web.
Maximal computing is an important part of STEM research, but it is also growing within the arts and humanities (e.g. interest in generative models such as GPT-3 and Stable Diffusion). While it is well recognised that global warming results from carbon and other greenhouse gases, broader AI communities have been slow to understand the relevance of our activities. As Knowles et al. (2021) note:
A serious and proportional response to the climate emergency would [...] involve constraining energy demand and mitigating drivers of infrastructure growth, and as a result, also consuming less energy. In real terms for computing, this means manufacturing fewer devices, storing and processing less data, generally managing with less compute power; and in terms of technical ambitions, scaling back the Internet of Things, resisting the temptation to throw AI and blockchain at every problem, and breaking free of the cycle of ever increasing demand for computation.
How do we decide whether a computational process is proportionate to the benefits it may create? In other words, when does it become appropriate to ‘throw AI and blockchain’ at a problem? Can the underlying goals of computationally intensive work sometimes be met in different ways? And when we do decide to use computationally intensive processes, how can we optimise them from a climate and sustainability perspective?
Compute needed to train state-of-the-art models has risen exponentially since about 2012 (Source: OpenAI, https://openai.com/blog/ai-and-compute/).
Some easy wins #
- Don’t use it at all. What are you trying to achieve? Can it be accomplished in another way?
- Don’t use it just for fun. Have fun of course, but also have a purpose in mind, one which would be sufficient even if it wasn’t fun.
- Use pre-trained models. There are so many to choose from! And likewise, try to make any models you do train available for others to use.
- Use smaller models. Larger models may yield better results, but often a smaller one will still do everything you need.
- Don’t overtrain your model. You don’t need to keep teaching the model something it learned 50 steps ago.
- Raise awareness of the issue. Normalise responsible use of compute.
- Explore doing it on-site. If your institution has High Performance Computing (HPC) facilities that you can access, there may be greater opportunities for collaboration and optimisation, compared to renting your compute off Amazon, Google or Microsoft. Shao et al. (2022) review some metrics for data centre energy efficiency.
- Support responsible use with UX design. If you are building something, that may mean making it deliberately less fascinating and immersive. In this space, user engagement in itself is not a measure of success. For more information on minimal UX design, see the “Minimal Computing” section of this Toolkit.
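One way to act on the "don't overtrain" advice above is early stopping: halt training once a validation score has stopped improving for a set number of checks. The following is a minimal, framework-free sketch. The function name `train_with_early_stopping`, the `patience` and `min_delta` parameters, and the toy loss values are assumptions made for this illustration, not part of any particular library.

```python
from typing import Iterable, Tuple


def train_with_early_stopping(
    validation_losses: Iterable[float],
    patience: int = 3,
    min_delta: float = 0.0,
) -> Tuple[int, float]:
    """Consume one validation loss per epoch; stop once it stops improving.

    Returns (epochs_run, best_loss).
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0

    for loss in validation_losses:
        epochs_run += 1
        if loss < best_loss - min_delta:
            # Genuine improvement: remember it and reset the counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # No improvement for `patience` epochs in a row.
    return epochs_run, best_loss


# Hypothetical validation losses standing in for a real training loop.
toy_losses = [0.90, 0.70, 0.60, 0.58, 0.59, 0.58, 0.60, 0.61, 0.57, 0.56]
epochs_run, best_loss = train_with_early_stopping(toy_losses, patience=3)
print(f"Stopped after {epochs_run} of {len(toy_losses)} epochs; best loss {best_loss:.2f}")
```

In a real project the iterable would be a generator that trains for one epoch and then yields the validation loss, so that breaking out of the loop actually stops the computation. Note the trade-off visible in the toy data: a small `patience` saves compute but can miss a later, slightly better result.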
Then it gets complicated #
The easy wins are not enough, and things quickly get complicated. Here are a few factors to take into account, when considering proportionality.
1. Can it be deferred? Does this computational process really have to be done now? Could it be done in five years’ time, when (if all goes according to plan) much more energy will be generated from renewable sources? Of course, research is often about winning the race, but maybe that’s an attitude that needs to shift.
2. What is its carbon impact? Monitoring maximal computing may help to identify where use is suboptimal or disproportionate to what the job is trying to achieve. Where is HPC use monitored in your institution? What kind of data is collected, and who gets to see it? Maximal computing resources are often denominated in hours of core use. Can they instead be denominated in CO2e?
3. On what balance sheet(s) will the carbon impact appear? Will it be your own university or research institution, or some third party? Are you happy with the way the reporting entity measures and discloses its carbon impact?
4. What are the potential benefits? Clearly, the underlying purpose of a project has bearing on how we assess its proportionality.
5. Do these benefits relate to sustainability? If so, you might decide to be a bit more lenient. Is your goal to optimize planting patterns for beleaguered pollinators, or to call out greenwashing in corporate disclosures at scale, or to tailor communications relating to famine relief? Or will the outcomes be unrelated to sustainability, or even maladaptive?
6. What are the broader ethical questions? Perhaps (as occasionally seems to happen in AI research) your goals are pure evil, only you haven’t noticed yet? Bringing in ethical considerations in an intensive way, early in the process, may help to decide when maximal approaches are appropriate. The ethics of AI is a very rich field. Prominent themes include bias (e.g. gender, race, language, class, geography, disability), opacity (is AI explainable? If so, who explains and who listens?), and broader considerations of justice. The Critical Algorithm Studies reading list and Zotero Library contain further reading on the politics and ethics of AI and algorithms more broadly. The Data Hazards project is developing labels that seek to communicate the risks involved in Data Science approaches, from concerns about privacy to high environmental cost: ‘Considering worst case scenarios is one part of this puzzle. Worst case scenarios free us from trying to predict the future: we’re not saying that something will happen’ (Thurlby and Di Cara 2021).
7. How likely is it to succeed? This is a question you can ask at multiple scales, e.g. a project, a model, a query.
8. Is there still optimisation that could be done? How lean was the software build process, and how lean is the software itself? Are there ways to improve or to green it (i.e. reduce resource consumption)? Can computationally intensive jobs be scheduled for when the sun is shining and the wind is blowing?
Focusing on Natural Language Processing, Strubell et al. (2019) recommend “a concerted effort by industry and academia to promote research of more computationally efficient algorithms, as well as hardware that requires less energy”. They also call for the development and promotion of “easy-to-use APIs implementing more efficient alternatives to brute-force grid search for hyperparameter tuning, e.g. random or Bayesian hyperparameter search techniques”.
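To illustrate the difference between brute-force grid search and random search, here is a small sketch using only the Python standard library. The search space, the trial budget, and the `toy_validation_score` function (a stand-in for an expensive training run) are all hypothetical values chosen for this example.

```python
import itertools
import math
import random

# Hypothetical search space for a fine-tuning job.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 3e-5, 1e-4, 3e-4, 1e-3],
    "batch_size": [8, 16, 32, 64],
    "dropout": [0.0, 0.1, 0.2, 0.3],
}


def grid_configs(space):
    """Yield every combination in the space (brute-force grid search)."""
    keys = list(space)
    for values in itertools.product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def random_configs(space, n_trials, seed=0):
    """Yield n_trials configurations sampled uniformly, with replacement."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        yield {key: rng.choice(values) for key, values in space.items()}


def toy_validation_score(config):
    """Stand-in for a training run; this scoring rule is invented."""
    return (-abs(math.log10(config["learning_rate"]) + 4)
            - abs(config["dropout"] - 0.1))


grid_size = sum(1 for _ in grid_configs(SEARCH_SPACE))
budget = 10  # number of training runs we are willing to pay for
best = max(random_configs(SEARCH_SPACE, budget), key=toy_validation_score)

print(f"Grid search would need {grid_size} training runs; random search used {budget}.")
print("Best configuration found:", best)
```

The point of the sketch is the budget: the grid grows multiplicatively with every added hyperparameter, while random search lets you fix the number of training runs in advance. Because sampling is with replacement, a small space may yield duplicate configurations, which a real implementation would want to skip.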
CodeCarbon can be embedded in Python code, to estimate emissions based on location, and recommend compute regions with lower carbon intensity for major cloud providers (AWS, Azure, and GCP).
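As a rough sketch of what embedding CodeCarbon might look like, the wrapper below runs a job and returns an emissions estimate alongside the result. It assumes the `codecarbon` package is installed (for example via `pip install codecarbon`) and uses its `EmissionsTracker` class with `start()` and `stop()`. The wrapper name `run_with_emissions_estimate` and the `toy_job` function are hypothetical. If the package is missing, the wrapper simply returns `None` for the estimate.

```python
def run_with_emissions_estimate(job, *args, **kwargs):
    """Run job(*args, **kwargs) and return (result, estimated_kg_co2e).

    The estimate is None when codecarbon is not installed.
    """
    try:
        from codecarbon import EmissionsTracker
    except ImportError:
        return job(*args, **kwargs), None

    # save_to_file=False keeps the tracker from writing an emissions CSV.
    tracker = EmissionsTracker(save_to_file=False)
    tracker.start()
    try:
        result = job(*args, **kwargs)
    finally:
        # stop() returns the estimated emissions in kg of CO2-equivalent.
        emissions_kg = tracker.stop()
    return result, emissions_kg


def toy_job(n):
    """Hypothetical stand-in for a computationally intensive task."""
    return sum(i * i for i in range(n))


result, emissions_kg = run_with_emissions_estimate(toy_job, 1_000_000)
if emissions_kg is None:
    print("No emissions estimate available.")
else:
    print(f"Estimated emissions: {emissions_kg:.6f} kg CO2e")
```

The figure is an estimate that depends on the hardware CodeCarbon can observe and on the location it infers or is given, so it is best treated as an order-of-magnitude guide rather than a measurement.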
Xu et al. (2021) offer a survey of deep learning optimisation techniques, which they categorise into compact networks, energy-efficient training strategies, energy-efficient inference approaches, and energy-efficient data usage.
9. What about openness? Are you using maximal computing in ways that can benefit other researchers and creators?
Perhaps it’s time for some new types of open licenses that aim to influence the carbon impact of derivative software and applications? We are of course deeply in favour of open practice. Nonetheless, even here there may be negative considerations too. In some cases a project may be justified by its proportionate use of maximal computing, yet, if shared inappropriately, it will predictably result in unjustifiable variants.
10. What affordances are you creating? If you are building something, does it encourage or enforce responsible use? Does your choice architecture and UI seek to maximise user engagement, or does it seek to encourage users to be careful and reflective?
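Questions 1 and 2 above can be made more concrete with a back-of-the-envelope conversion from core-hours to CO2e. The sketch below is only an illustration: the power draw per core, the data centre overhead factor (PUE, power usage effectiveness), and both grid carbon intensities are hypothetical placeholder values, and a real estimate would need figures from your own institution and electricity supplier.

```python
def core_hours_to_co2e_kg(core_hours, watts_per_core, pue, grid_g_co2e_per_kwh):
    """Convert core-hours into an estimated mass of CO2e in kilograms.

    energy (kWh) = core-hours x watts per core / 1000 x PUE
    CO2e (kg)    = energy (kWh) x grid intensity (g/kWh) / 1000
    """
    energy_kwh = core_hours * watts_per_core / 1000 * pue
    return energy_kwh * grid_g_co2e_per_kwh / 1000


# All of the numbers below are hypothetical, for illustration only.
job_core_hours = 10_000
watts_per_core = 10        # assumed average draw per core
pue = 1.5                  # assumed data centre overhead (cooling etc.)
grid_today = 200           # assumed g CO2e per kWh now
grid_future = 50           # assumed g CO2e per kWh if the grid decarbonises

now_kg = core_hours_to_co2e_kg(job_core_hours, watts_per_core, pue, grid_today)
later_kg = core_hours_to_co2e_kg(job_core_hours, watts_per_core, pue, grid_future)

print(f"Run now:      about {now_kg:.1f} kg CO2e")
print(f"Run deferred: about {later_kg:.1f} kg CO2e")
```

Even a crude conversion like this lets a usage report say "30 kg CO2e" rather than "10,000 core-hours", and makes the possible benefit of deferring a job visible. It ignores embodied emissions from manufacturing the hardware.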
Measures of Efficiency #
Machine Learning models can be measured using metrics like T-scores, confusion matrices, accuracy, recall, area under the ROC curve, and log loss (Minaee 2019). It is becoming more common to see the literature report on the electricity consumption and environmental impact of advanced maximal computational research. Some useful environmentally relevant metrics to consider when planning your AI research projects are:
- Floating point operations (FPO), counted in FLOPs or GFLOPs
- Processor utilisation (%), % use of CPU/GPU/TPU
- Electricity consumption, in watt-hours (Wh)
Lacoste, Luccioni and Schmidt have been researching environmental impacts of AI and ML, and have developed an online open source Machine Learning CO2 impact calculator. This is an important tool to help our understanding of the emissions of ML-based humanities research.
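To make some of the metrics in this section concrete, the sketch below computes a confusion matrix, accuracy, recall and log loss for a toy binary classifier, and reports them next to an energy figure in watt-hours. The labels, predicted probabilities, power draw and run time are all invented for illustration. In practice you would more likely use an established library for the model metrics and a measured figure for the energy.

```python
import math

# Hypothetical ground-truth labels and predicted probabilities of class 1.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]


def confusion_counts(truth, predicted):
    """Return the four cells of a binary confusion matrix."""
    pairs = list(zip(truth, predicted))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }


def log_loss(truth, probabilities, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(truth, probabilities):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(truth)


counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
recall = counts["tp"] / (counts["tp"] + counts["fn"])
loss = log_loss(y_true, y_prob)

# Energy in watt-hours = average power (W) x duration (h); assumed values.
average_power_watts = 250
training_hours = 2
energy_wh = average_power_watts * training_hours

print(counts)
print(f"accuracy={accuracy:.2f} recall={recall:.2f} log_loss={loss:.3f} energy={energy_wh} Wh")
```

Reporting the two kinds of figure side by side makes it easier to ask whether a small gain in accuracy was worth the extra energy.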
Red AI and Green AI #
“The vital first step toward more equitable and green AI is the clear and transparent reporting of electricity consumption, carbon emissions, and cost. You can’t improve what you can’t measure.”
Jesse Dodge, Allen Institute for AI, coauthor of Green AI.
In 2020, Schwartz et al., writing in Communications of the ACM, suggested that to measure the efficiency of AI models we need to report the amount of work required. This includes the work to train the model, tune the hyperparameters, and retrain the model over however many iterations you use. Understanding the cost of processing a single document, the size of the data set, and the steps in your pipeline (such as preprocessing, cleaning, and enriching) helps you to comprehend the total work required, and thus allows you to optimise at specific points in your pipeline. Red AI refers to the dominant approach, which seeks to improve results through massive computational power without regard to environmental impact. Striving to “treat efficiency as a primary evaluation criterion alongside accuracy” is what Schwartz et al. call Green AI.
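Schwartz et al. describe the cost of producing a result as growing roughly in proportion to three things: the cost of processing a single example, the size of the data set, and the number of experiments run (for instance during hyperparameter tuning). The sketch below applies that idea to a pipeline with several steps. The step names, the per-document costs in seconds, the corpus size and the number of runs are all hypothetical.

```python
# Hypothetical compute cost per document for each pipeline step, in seconds.
STEP_COST_SECONDS = {
    "preprocessing": 0.002,
    "cleaning": 0.001,
    "enriching": 0.010,
    "training": 0.050,
}


def total_work_seconds(step_costs, n_documents, n_runs):
    """Total work = (sum of per-document step costs) x documents x runs."""
    per_document = sum(step_costs.values())
    return per_document * n_documents * n_runs


n_documents = 100_000  # assumed corpus size
n_runs = 12            # assumed number of training or tuning runs

work = total_work_seconds(STEP_COST_SECONDS, n_documents, n_runs)
most_expensive_step = max(STEP_COST_SECONDS, key=STEP_COST_SECONDS.get)

print(f"Estimated total work: {work / 3600:.1f} compute-hours")
print(f"Most expensive step per document: {most_expensive_step}")
```

Because the three factors multiply, halving any one of them (fewer runs, a smaller corpus, or a cheaper step) halves the total. This sketch treats every step as repeated on every run; steps such as cleaning that only need to run once would be counted separately in a more careful estimate.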
Mini case study #
GPT-n workshops case study #
In some recent workshops to introduce participants to GPT-n text generators, participants fine-tuned a pre-trained model using texts of their choosing.
There were good pedagogic reasons to let everyone choose their own input text, e.g. from Gutenberg, for the fine tuning: it turns the activity into a more exciting experiment. It also made sense for me to do the activity myself, because (1) it's easy to forget details when you're describing something from memory and notes, and (2) I know from experience that certain fine-tuning texts will "work well," so we would have at least one interesting output to discuss at the end.
But in the short time available, participants often chose arbitrary and / or "obvious" input texts (e.g. Shakespeare, or an author they were primed to choose because that author had been mentioned earlier in the workshop). Likewise, pushed for prep time, I realised I was tending to use the same input texts again and again. Furthermore, we would generate hundreds of pages of text, and then only browse the first few - and never use the rest for anything.
My new rule of thumb: if I am running a computationally intensive process for educational or demo purposes, do so in a way that doubles as research (in a loose sense of "research").
For the latest iteration of the workshop we have:
- Explicitly included discussion of carbon costs.
- Used a recording of myself doing the fine-tuning and text generation.
- Created a shared folder of potential fine-tuning inputs which I am interested in for various reasons - participants could choose one of these or pick their own.
- Built in time to discuss actual immediate use cases of text generation, and what questions we might ask of the output (and how many pages we needed to generate).
In some ways the workshop felt clunkier, and I had to cut some content which I liked. But it still felt worth it to include very basic sustainability considerations.
If I run similar workshops in the future I hope to:
- Find out more about the actual carbon footprint of these processes.
- Explore a format where small groups collaborate, running just one fine-tuning per group.
- Explore choosing from a variety of already fine-tuned models, rather than running new fine-tunings.
- Seek out others who may already want synthetic texts based on specific fine-tuning inputs, and offer to do this for them as part of the workshop.
- Continue to think about proportionality, and if the carbon cost does seem too high, then replace the workshop with something else entirely.
I don't think that the energy savings will be extensive. When I think of all the users of art AIs, generating thousands and thousands of images just out of curiosity or a playful compulsion, it feels like a drop in the ocean. But it also feels important to model responsible behaviour. A minimal set of principles for using AI to generate text or art might involve:
- Spend the time to articulate what I am trying to do and why (not just “to try it out” or “to see what happens”).
- Whenever possible, combine different purposes (try out an experimental approach in a way that may also contribute to a particular project, etc.).
- Whenever possible, work in an open and shared way, so that others can benefit from my use of the AI.
Adapting existing infrastructures #
Large organisations such as Higher Education Institutions (HEIs) or other public sector organisations require complex IT infrastructure to satisfy their diverse stakeholder requirements. For example, a university will typically have a central IT service offered out across different academic departments, but some departments will also run their own more specialised services. Service catalogues provide a way for organisations to describe what technologies and services they support, and could be a good starting point for communicating their environmental footprints. This may then be used to raise awareness or feed into data management plans when describing the infrastructure requirements for a new research project. A simple colour-coded gradient scheme, similar to the energy labels you see on electrical appliances, could be used to help categorise the technologies and services.
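As one possible illustration of such a colour-coded scheme, the sketch below assigns each service in a catalogue to a band based on an estimated annual footprint. The band thresholds, colours, service names and footprint figures are entirely hypothetical. An institution would need to agree its own thresholds and its own way of estimating each footprint.

```python
# Hypothetical bands: (upper bound in kg CO2e per year, label, colour).
BANDS = [
    (50, "A", "dark green"),
    (200, "B", "green"),
    (500, "C", "yellow"),
    (1000, "D", "orange"),
]
HIGHEST_BAND = ("E", "red")  # anything above the last upper bound


def footprint_band(kg_co2e_per_year):
    """Return the (label, colour) band for an estimated annual footprint."""
    for upper_bound, label, colour in BANDS:
        if kg_co2e_per_year <= upper_bound:
            return label, colour
    return HIGHEST_BAND


# Hypothetical service catalogue entries with estimated footprints.
catalogue = {
    "Static project website": 30,
    "Departmental file store": 180,
    "Always-on virtual machine": 650,
    "GPU cluster allocation": 5000,
}

for service, footprint in catalogue.items():
    label, colour = footprint_band(footprint)
    print(f"{service}: band {label} ({colour}), about {footprint} kg CO2e per year")
```

A table like this could sit alongside the existing service descriptions, so that researchers writing a data management plan can see the relative footprint of the options at a glance.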
As more organisations shift their infrastructure from always-on Virtual Machines (VMs) to more on-demand compute services, we should start to see benefits in energy consumption, especially in large organisations that have budgeted for VMs that might run with minimal use for periods of time. We will move to a model that is more elastic and can scale horizontally to accommodate changes in demand. Cloud providers such as Amazon have started to provide their own carbon footprint calculators.
We are also seeing a shift towards using Content Delivery Networks (CDNs) which can provide in-memory caches of popular resources avoiding the requirement to do expensive transfers across backend systems. Understanding the computation or energy benefits from these caches would provide useful insight.
Within software development projects it is typical to see CI/CD pipelines being deployed. We need to raise awareness of the impacts of continually triggering what might be a resource-intensive process.
Further reading #
Blair, Gordon S. ‘A Tale of Two Cities: Reflections on Digital Technology and the Natural Environment’. Patterns 1, no. 5 (14 August 2020). https://doi.org/10.1016/j.patter.2020.100068.
Knowles, Bran, Kelly Widdicks, Gordon Blair, Mike Berners-Lee, and Adrian Friday. ‘Our House Is On Fire: The Climate Emergency and Computing’s Responsibility’. Communications of the ACM, 2 December 2021. https://eprints.lancs.ac.uk/id/eprint/162995/.
Schwartz, Roy, Jesse Dodge, Noah A. Smith, and Oren Etzioni. ‘Green AI’. Communications of the ACM 63, no. 12 (17 November 2020): 54–63.
Minaee, Shervin. ‘20 Popular Machine Learning Metrics’. Towards Data Science, 28 October 2019. [Accessed 2022-04-06].
Shao, Xiaotong, Zhongbin Zhang, Ping Song, Yanzhen Feng, and Xiaolin Wang. 2022. ‘A Review of Energy Efficiency Evaluation Metrics for Data Centers’. Energy and Buildings 271 (September): 112308. https://doi.org/10.1016/j.enbuild.2022.112308.
Strubell, Emma, Ananya Ganesh, and Andrew McCallum. 2019. ‘Energy and Policy Considerations for Deep Learning in NLP’. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645–50. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1355.
Xu, Jingjing, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, and Lei Li. 2021. ‘A Survey on Green Deep Learning’. https://doi.org/10.48550/ARXIV.2111.05193.