I have been using node.js for a while now, and I do like it. I have yet to make up my mind as to whether I consider it a solid production environment or just a great tool for scripting and getting prototypes off the ground, but even so, I definitely see its place in today’s software engineering ecosystem.
One of the things that I love about the node.js ecosystem is that most of the small (annoying?) things that I need are already done, in small single-purpose npm modules. (Arguably the same can be said about Java; however, with the Java community I have found that, more often than not, the solutions to these small problems are over-engineered, whereas node.js peeps seem to thrive on little scripts which put together a lot of these modules.)
However, I have recently discovered a really big annoyance with the node.js ecosystem, and that is when it comes to file downloading!
Here’s the scenario I have: based on some complex variables and parameters I’m constructing an equally complex URL which hits one of our graphing systems at Netflix, and this URL generates a PNG image. I want to download this image and then process it. What I mean by “process it” is irrelevant for the purpose of this discussion; what is important is that I want to download the response of this URL and save it locally as a PNG file (which is to say that I just need to write the response bytes to a .png file on the local system).
And this is where my issue arises: all the npm modules I have found for downloading files have the same problem, which is that they build the name of the destination file on the file system based on the input URL! By the way, I’m talking here about the npm modules which allow me to do this via a one-liner, something like download(url, destinationFile). I am aware I can write my own damn HTTP downloader by implementing lower-level methods which make the HTTP call and save the bytes etc.
Can you see the problem here? If not I can tell you there are 2 (at least!):
- If you base your download file on the full URL then you end up with loooong filenames and characters rejected by the file system:
http://liviutudor.com/query?one=two&three=four&five=six&seven=eight&nine=ten....&one_hundred=one_hundred_and_one
will generate the filename
query?one=two&three=four&five=six&seven=eight&nine=ten....&one_hundred=one_hundred_and_one
which gets rejected by most file systems because of its length and the illegal characters in it.
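- On the other hand, if you base it just on the base name, then http://liviutudor.com/query?x=y and every other URL that shares its path but differs in the query string generate the same filename, so the downloaded files clash.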
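Both problems are easy to reproduce. The sketch below is not taken from any particular npm module: the two helper functions are hypothetical stand-ins for the two naming strategies, and the second URL in the collision check is made up for illustration.

```javascript
const path = require('path');

// Hypothetical strategy 1: name the file after everything that follows
// the last "/" in the URL, query string included.
function nameFromFullUrl(url) {
  return url.slice(url.lastIndexOf('/') + 1);
}

// Hypothetical strategy 2: name the file after the base name of the URL
// path only, ignoring the query string.
function nameFromBaseName(url) {
  return path.posix.basename(new URL(url).pathname);
}

// Problem 1: the query string ends up in the filename, "?" and all.
console.log(nameFromFullUrl('http://liviutudor.com/query?one=two&three=four'));
// prints: query?one=two&three=four

// Problem 2: two different URLs map to the same filename.
const first = nameFromBaseName('http://liviutudor.com/query?x=y');
const second = nameFromBaseName('http://liviutudor.com/query?x=z'); // made-up second URL
console.log(first, second, first === second);
// prints: query query true
```

Neither strategy can be fixed by tweaking the naming rule alone, which is why I want the caller to choose the destination file.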
So what I was looking for was just a way to specify the input URL and the destination file, not just the folder. You think I could find a module to do that easily? Nope!
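For illustration only, the behaviour I am after boils down to something like the sketch below, written against Node’s built-in http, https and fs modules. The function name download and its two parameters are my own shorthand, not the API of any existing npm module, and the sketch assumes a plain 200 response: it does not follow redirects, retry, or clean up a partially written file.

```javascript
const fs = require('fs');
const http = require('http');
const https = require('https');

// Hypothetical helper: fetch `url` and write the response bytes to
// `destinationFile`, a path chosen entirely by the caller.
function download(url, destinationFile) {
  return new Promise((resolve, reject) => {
    // Pick the client module that matches the URL scheme.
    const client = url.startsWith('https:') ? https : http;
    client.get(url, (response) => {
      if (response.statusCode !== 200) {
        response.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected status code ${response.statusCode}`));
        return;
      }
      // Stream the body straight to disk instead of buffering it in memory.
      const file = fs.createWriteStream(destinationFile);
      response.pipe(file);
      file.on('finish', () => resolve(destinationFile));
      file.on('error', reject);
    }).on('error', reject);
  });
}
```

With something like this, a call such as download(graphUrl, 'graph.png') (both argument values being placeholders) would save the image under the name I picked, however ugly the URL is.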
The nearest I could get was url-download, which allows you to specify just an output directory, so in the end I decided to send a pull request for this feature (track its progress on Github here: https://github.com/miniflycn/url-download/pull/4) to actually get what I needed.
So if you have suffered from the same problem then hopefully this pull request will get merged soon and you can use it. Alternatively, if you know a solution for the above (no, no, I DON’T want to write my own HTTP handler!) please drop me a line and let me know.
Heads up: this pull request was merged on Sun 06-Nov-2016, so expect any version after 0.0.5 to have this.
I have actually enquired as to the next release and you can track the conversation regarding this yourself in this Github issue: https://github.com/miniflycn/url-download/issues/1
To save you from checking that URL, version 0.0.7 is now out with my fix.
Ooops, it seems version 0.0.7 is broken, folks, so pick version 0.0.8+.