Getting the maximum value of an array in JavaScript, and other thoughts
The other day I was watching “Think Like a Computer Science Professor” on freeCodeCamp’s YouTube channel. It’s a recording of Radu Mariescu-Istodor, PhD in computer science, coding an animated avatar with face tracking in JavaScript. Dr. Mariescu-Istodor seems to be very knowledgeable and is a great communicator. Having seen both the process and the outcome, I can only recommend watching it.
Having said that, we have the benefit of having had Dr. Mariescu-Istodor do all the heavy lifting, and we are free to learn not only from what has been done, but also from what could have been done differently and, ideally, better. This whole article is nitpicking about something that, by all accounts, has no impact on the project and its learning value. On a “real” project, I would not optimise my code either unless the benefits were very apparent beforehand and more important objectives had already been met (usually, delivering a working version of the software).
The original code
The code I am going to comment on appears at around the 9 hours and 3 minutes mark:
function getConstelation(locs){
    const pointWithMaxY=locs.find(p=>p[1]==Math.max(...locs.map(l=>l[1])));
    // Rest of the code omitted
}
This line in the function takes an array of arrays, each one containing two numeric values that represent x and y coordinates, and finds the first element that contains the maximum y value of the whole set.
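For illustration, with some made-up points (these values are not from the project), this is the expected behaviour:
const locs = [ [ 10, 40 ], [ 25, 90 ], [ 30, 90 ], [ 50, 15 ] ];
// The maximum y value is 90 and two points share it; find() returns the first one.
const pointWithMaxY = locs.find( p => p[ 1 ] == Math.max( ...locs.map( l => l[ 1 ] ) ) );
console.log( pointWithMaxY ); // [ 25, 90 ]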
Let’s see how and why it can be improved.
Using Math.max.apply( null, array ) instead of Math.max( ...array )
The spread syntax is compact, readable, and reasonably efficient. However, in this case we can get the same result using the apply() method. It requires no more specific knowledge of the language than the spread syntax and, although it is at least eight characters longer, has one clear advantage when we time both options:
( function( ) {
    // Create an array and fill it with 100,001 random values.
    const values = [];
    for ( let i = 1e5; i >= 0; i = i - 1 ) {
        values[ i ] = Math.random();
    }

    console.time( "apply" );
    Math.max.apply( null, values );
    console.timeEnd( "apply" );

    console.time( "spread" );
    Math.max( ...values );
    console.timeEnd( "spread" );
} )( );
The results vary between executions, but on my machine I usually get a 2ms difference:
apply: 1ms - timer ended
spread: 3ms - timer ended
The difference seems consistent if we change the order of the tests. I would hazard a guess that it is due to the number of iterations each line of code requires: Math.max.apply( null, values ) iterates once over values, whilst Math.max( ...values ) requires two iterations, one to spread the array and another to apply the Math.max function.
I am sure that the time loss is irrelevant under certain conditions, but there are no apparent disadvantages to using apply() instead of the spread syntax. Also, the getConstelation method happens to be called inside a requestAnimationFrame() loop, so on a browser with a 60fps refresh rate it will be called sixty times per second. Although losses (and benefits) must be properly measured and verified before and after optimising, we can reasonably estimate a saving of around 12% of each second’s time budget: 60 calls × 2ms / 1000ms = 12%.
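As a minimal sketch of that call pattern (this is not the project’s actual code, and locs here stands in for whatever the face tracker produces on each frame):
function animate() {
    const constelation = getConstelation( locs ); // `locs` is assumed to be updated elsewhere by the face tracker.
    // ... draw the avatar using `constelation` ...
    requestAnimationFrame( animate ); // Queue the next call, typically sixty times per second.
}
requestAnimationFrame( animate );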
Pre-caching values and improving the code’s legibility
The maximum y value can be determined prior to using it to iterate over the array and find the first point with said value. This improves the code’s efficiency, but the better reason is that the code becomes more readable, or at least more sequential.
The purpose of the original getConstelation method has a larger scope than finding out said value, so let’s define a function for this concrete purpose:
function getFirstPointWithMaxY( points ) {
    const yValues = points.map( point => point[ 1 ] );
    const maxYValue = Math.max.apply( null, yValues );
    const pointWithMaxY = points.find( point => ( point[ 1 ] == maxYValue ) );
    return pointWithMaxY;
}
I prefer using mostly semantic variable names (e.g., point instead of p) and only common abbreviations (e.g., maxYValue instead of maxYVal, or points instead of pts), not unlike the naming conventions in WordPress’ JavaScript Coding Standards, for example. All other benefits aside, I find that it helps me by reducing the documentation effort.
Most JavaScript code can be minified anyway so I try to prioritise legibility. Besides, this format is better suited to help identify the following improvement.
Reducing the number of iterations
The previous code iterates three times over essentially the same amount of data: once over points, another time over an array derived from points (with the same number of elements), and once more over points. The same results can be obtained with a single loop:
function getFirstPointWithMaxY( points ) {
    let pointWithMaxY = undefined;
    // Walk the array backwards; thanks to the `<=` comparison, ties are won by
    // the point with the lower index, so the first point with the maximum y
    // value is the one that remains.
    for ( let p = points.length - 1; p >= 0; p = p - 1 ) {
        if (
            ( pointWithMaxY == undefined )
            || ( pointWithMaxY[ 1 ] <= points[ p ][ 1 ] )
        ) {
            pointWithMaxY = points[ p ];
        }
    }
    return pointWithMaxY;
}
Following a conventional analysis and notation, this method has less space complexity, using pointWithMaxY (O(1)) instead of yValues (O(n)). The time complexity is still O(n), but there is only one loop instead of three (or four, if we use spread syntax), so we can assume there is a time saving.
Measuring
We have to verify that there actually is an improvement. Short of measuring the performance in production, we can establish a metric. Timing is simple and a reasonable indicator (with many caveats, but that is a matter for another article), so we can do that again.
So, let’s say the version with map(), Math.max.apply(), and find() is called iteratorsFunction, and the version with the for loop is forLoopFunction. We create an array of 100,000 points with random values and call both methods. This is repeated on multiple occasions, switching the order of the function calls from time to time.
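A minimal sketch of such a harness (assuming iteratorsFunction and forLoopFunction are the two versions of getFirstPointWithMaxY shown above; the exact setup may have differed) could be:
( function( ) {
    // 100,000 made-up points with random x and y values.
    const points = [];
    for ( let i = 1e5 - 1; i >= 0; i = i - 1 ) {
        points[ i ] = [ Math.random(), Math.random() ];
    }

    console.time( "iteratorsFunction" );
    iteratorsFunction( points );
    console.timeEnd( "iteratorsFunction" );

    console.time( "forLoopFunction" );
    forLoopFunction( points );
    console.timeEnd( "forLoopFunction" );
} )( );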
Measured times in Chrome 105.0.5195.102 and Edge 105.0.1343.27 varied quite a bit between executions but, regardless of the values, the for loop was consistently faster. For example:
iteratorsFunction: 12.810791015625 ms
forLoopFunction: 1.883056640625 ms
However, the same test in Firefox 104.0.2 produced something unexpected:
iteratorsFunction: 3ms - timer ended
forLoopFunction: 8ms - timer ended
These measurements did not vary much, with the for loop always clocking more time.
A good thing about the for loop is that it is easy to analyse and alter, with something as crude as a “what if” approach. It did not take long to find out what (although not why) was making it slower: ( pointWithMaxY == undefined ). Changing it to ( pointWithMaxY == void 0 ) or simply !pointWithMaxY (safe here because every element of points is an array, and arrays are always truthy) produced less surprising results:
iteratorsFunction: 3ms - timer ended
forLoopFunction: 1ms - timer ended
This change did not alter the results in the other tested browsers.
Additional benefit
The original code relies on methods and syntax that are specific to JavaScript. Even though they are not that different to other languages, some details might not be self-evident. For example, although it can be surmised from the context, someone might not know that “the find() method returns the first element in the provided array that satisfies the provided testing function.”
The proposed code uses a generic for loop and requires much less language-specific knowledge to understand. Granted, it is less compact, but when the trade-off is better performance and readability, I find it hard to argue against refactoring.
At the end of the day, it is a matter of balance. Sometimes performance is key and there is an obligation to write efficient code at the expense of legibility. Other times, we can incur penalties for the sake of a leaner (meaning “having less text”) code base. For example, instead of creating an array and using a for loop to fill it with random values, I could have written:
const values = Array( 1e5 ).fill().map( Math.random );
It is pretty straightforward and does produce an array of 100,000 random values, albeit more slowly.
A couple more things
Documenting
Comment the code, to a reasonable degree, even if the comments seem verbose or the code seems self-describing. In this case, I would opt for JSDoc:
/**
 * Get the first point with the maximum y value of the array.
 * @param {number[][]} points An array of points. Each point is also an
 *                            array with at least two values, the second one
 *                            being the value of the y coordinate.
 * @returns {number[]} The first point with the maximum y value of the array.
 *                     `undefined` if it could not be found.
 */
function getFirstPointWithMaxY( points ) {
    let pointWithMaxY = void 0; // Same as `undefined`, but faster.
    for ( let p = points.length - 1; p >= 0; p = p - 1 ) {
        if (
            !pointWithMaxY
            || ( pointWithMaxY[ 1 ] <= points[ p ][ 1 ] )
        ) {
            pointWithMaxY = points[ p ];
        }
    }
    return pointWithMaxY;
}
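As a quick usage example with made-up values (the same contract that the unit tests below exercise):
const highest = getFirstPointWithMaxY( [ [ 12, 7 ], [ 3, 42 ], [ 8, 42 ] ] );
console.log( highest ); // [ 3, 42 ], the first point with the maximum y value.
console.log( getFirstPointWithMaxY( [] ) ); // undefined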
This has a few benefits, both immediate and potential:
It forces you to think about and define how a certain piece of code is expected to behave, even if the code is not there yet.
It may help reveal bugs, making it possible to determine when there is a disconnect between what some code should do and what it actually does. There is a bit more on this in the next section.
Some editors parse these comments and show contextual hints with the information in them. Not having to go and see what some function does, or what each argument is supposed to be, for example, saves time.
It helps maintain the code. Someone might have to build upon it or modify it in the future. That someone might even be me, which is very likely if I programmed it in the first place, and I know future me will not have all the details of the project fresh in his mind.
Testing
I would not set up a test suite just for this function. Nevertheless, it is as good a place as any to start testing. Since we are dealing with a very isolated portion of code, the obvious starting point is unit testing. Having documented what the code should do, it is easier to define tests.
There are quite a few unit testing frameworks for JavaScript that facilitate this task. For example, using Jest:
test(
    "Returns `undefined` for an empty array",
    () => {
        expect( getFirstPointWithMaxY( [] ) ).toBe( undefined );
    }
);

test(
    "Finds point with max y in an array with only one coincidence",
    () => {
        const points = [ [ 0, 0 ], [ 1, 1 ], [ 2, 2 ], [ 3, 3 ], [ 4, 4 ] ];
        expect( getFirstPointWithMaxY( points ) ).toStrictEqual( [ 4, 4 ] );
    }
);

test(
    "Finds first point with max y in an array with two coincidences",
    () => {
        const points = [ [ 0, 4 ], [ 1, 1 ], [ 2, 2 ], [ 3, 3 ], [ 4, 4 ] ];
        expect( getFirstPointWithMaxY( points ) ).toStrictEqual( [ 0, 4 ] );
    }
);

// More tests...
The relevant output ought to be akin to:
PASS ./getFirstPointWithMaxY.test.js
√ Returns `undefined` for an empty array (1 ms)
√ Finds point with max y in an array with only one coincidence
√ Finds first point with max y in an array with two coincidences
My experiences with unit testing have probably skewed the priority I give it relative to other chores, but I would be remiss if I ignored it.
In closing
Keep it simple. Document. Test. Measure.
Do it all judiciously: these are not objectives, they are strategies and tools to help deliver software.
And check out Radu Mariescu-Istodor’s YouTube channel. It is full of entertaining content.