Methods: the greeNsort® aproach

The greeNsort^® project follows a pragmatic scientific approach, from focused R&D over free publication to low-cost implementation

Research & Development

During the past 20 years, histories and expositions of mathematics for general readers have gotten dramatically better, while the analogous histories and expositions of computer science have gone downhill. — Donald Knuth¹

Science shall be systematic and unbiased, but trends and biases threaten objectivity, namely replication bias and publication bias. What testing is for IT, is replication for science: it’s core to the method but rarely done because there are little incentives. greeNsort^® is an extreme replication project: greeNsort^® went back to the roots of digital sorting, starting with von Neumann’s Mergesort (1945) and Hoare’s Quicksort (1961). greeNsort^® analyzed trends in the history of sorting research and questioned important research results. greeNsort^® reviewed core sorting literature, identified limiting beliefs and focused on unexplored parts of solution space.

Core elements of the greeNsort^® method are:

define sustainable evaluation criteria (rather than just measure speed)²
enhance the definition of sorting and search in an enhanced solution space (rather then search where others have searched)³
focus on generic sorting (avoid specialized algorithms such as radix)⁴
focus on simple pragmatic methods (avoid rather academic improvements such as stable inplace sorting)⁵
choose a robust machine model and save effort of mathematical proofs (rather than apply a pseudo-precise but unrealistic machine model)⁶
iterate often design, implementation and measurement (rather than putting time into academic publication)⁷
learn history⁸, doubt history, try by yourself⁹
try hard and test harder¹⁰

Using this approach, greeNsort^® developed sustainable algorithms, see results and portfolio

Liberating the Intellectual Property

The value of an idea lies in the using of it — Thomas Edison

In order to realize it’s full savings potential, the greeNsort^® algorithms must be freely available. Usual patent licensing would block broad adoption and making a patent freely available is too expensive¹¹. Instead of patenting we published our new algorithms, here, on arXiv.org and github and donate the know-how to the public.

Dissemination

computing professionals should promote environmental sustainability both locally and globally — ACM Code of Ethics

The Association for Computing Machinery (ACM) is a is a US-based international professional association for computing. With its motto “Advancing Computing as a Science & Profession” it runs journals for scientists and practitioners, hence ideal for disseminating new scientific results relevant for practice. We developed a paper that can serve as the basis for an article in the practitioner journal acmqueue and/or in ACMs flagship journal Communications of the ACM which addresses the most broad audience of scientists and practitioners.

Implementation I: Plug-in rollout

good interfaces are the essence of good code — Bjarne Stroustrup

greeNsort^® algorithms can be quickly adopted by plugging them into the standard sorting APIs of the most important programming languages, because this ´s no changes to software calling these APIs. The TIOBE index ranks programming languages for popularity. The following is a list of TIOBEs ten most important languages (July 2020) plus another five that are important as systems programming languages or for large-scale numerical processing:

Position	Language	Ratings%	Cumulated%
1	C	16.45	16.45
2	Java	15.10	31.55
3	Python	9.09	40.64
4	C++	6.21	46.85
5	C#	5.25	52.10
6	Visual Basic	5.23	57.33
7	JavaScript	2.48	59.81
8	R	2.41	62.22
9	PHP	1.90	64.12
10	Swift	1.43	65.55
12	Go	1.21	66.76
18	Rust	0.70	67.46
26	D	0.55	68.01
36	Julia	0.34	68.35
50	Fortran	0.22	68.57

They cover two thirds of the TIOBE index ratings, hence by improving their standard sorting libraries, it is possible to roll-out superior algorithms to a large number of software applications. One entry point for finding allies for getting this done is: ClimateAction.tech.

Implementation II: Optimal Interfaces

interfaces shouldn’t be unduly influenced by implementations — Kent Beck

Additional energy savings can be realized if sorting APIs are improved. Since improved APIs require changed in the calling code, the fruits of such optimizations can only be harvested in the long run. Some examples of improvement opportunities follow:

The sort API in the standard C library, which requires the user to submit a tri-boolean comparison function cmp(x,y), which returns -1 | 0 | +1 for x<y | x==y | x>y and hence requires two internal comparisons. A typical use of cmp is a single comparison as in if cmp(x,y) > 0 {} else {} which now costs two comparisons instead of one.
The sort API in R asks the user to specify the sorting method as one of 'shell' | 'quick' | 'radix' . Asking the API for a specific algorithm prevents the API to use a more efficient algorithm where applicable. Hence it is better to specify requirements such as need_stable in TRUE | FALSE and let the API choose the best algorithm matching the requirements.

Implementation III: teaching

The cheapest, fastest, and most reliable components are those that aren’t there — Gordon Bell

In some cases, writing custom algorithms has advantages over re-using libraries. Hence software engineers need to know algorithms. Hence we need to teach them.

If you have an apple and I have an apple and we exchange these apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas. — George Bernard Shaw

in a letter to Martin Campbell-Kelly cited in Let’s Not Dumb Down the History of Computer Science. In his 2014 Kailath lecture Knuth explains why he, as a scientist, gets so much out of reading the history of science:
1) To understand the process of discovery
2) To understand the process of failure
3) To celebrate the contributions of many cultures
4) Telling historical stories is the best way to teach
5) To learn how to cope with life
6) To become more familiar with the world, and to know how science fits into the overall history of mankind
Since Knuth is a kind person, he did not add points 7 and 8, let me do it for him
7) To identify trends that the scientific herd is following — in a possibly wrong direction
8) To correct wrong scientific decisions in the past that have led to suboptimal results today
↩︎
“First, solve the problem. Then, write the code.” ― John Johnson↩︎
“It came to appear that, between two truths of the real domain, the easiest and shortest path quite often passes through the complex domain” ― Paul Painlevé↩︎
“The key to performance is elegance, not battalions of special cases” — Jon Bentley and Doug McIlroy↩︎
“Simplifications have had a much greater long-range scientific impact than individual feats of ingenuity. …. Simplicity and elegance are unpopular because they require hard work and discipline to achieve and education to be appreciated.” ― Edsger W. Dijkstra↩︎
“It is better to be vaguely right than exactly wrong.” — Carveth Read↩︎
“In the fields of observation chance favors only the prepared mind” — Luis Pasteur↩︎
“Statistics from the IEEE and ACM digital libraries suggest that we rarely look at anything more than two or three years old … and miss ideas that have shaped 60 years of research”, David Alan Grier ↩︎
“The difficulty lies not so much in developing new ideas as in escaping from old ones.” — John Maynard Keynes↩︎
“Genius is one percent inspiration, and ninety-nine percent perspiration.” — Thomas Edison↩︎
Why do firms give away their patents for free? The technology providing firm, is characterized by making specific core patents freely available […] In serving society these companies argue that patenting and afterwards releasing the patent guarantees that the technology remains open.↩︎