For a friend, I recently researched which notebook can be recommended for Ultra HD video editing (4K UHD, 3840 × 2160 px). Here is, in short, what we found.
First priority: Intel Core i7-7xxx CPU, as fast as possible
There are three major ways to encode video: in software on the CPU (using libraries like libx264 and libx265), in hardware on the CPU (using Intel's or AMD's dedicated encoding features), or in hardware on the GPU (using Nvidia's NVENC mechanism). The hardware-based mechanisms are much faster: one comparison test measured 55 fps on an i7-5930K CPU and up to 540 fps on an NVIDIA GeForce GTX 980 GPU [source]. So a speedup factor of about 10 can be expected.
However, hardware encoding is somewhat limited in terms of features, so the same video quality results in a bigger file size, or for the same file size you get less quality. For example, Nvidia NVENC supports B-frames (a small, compressed frame type that reduces video size by approx. 30% at the same quality) only in H.264, not in H.265 [source]. So the enthusiast video editor will probably want to do the final encoding runs in software on the CPU, which can be 20 times slower but gives better quality for the same file size. Preview versions can still be created with CPU or GPU hardware support; they do not need as much computation power anyway, as resolutions will be lower. Reportedly, the most powerful competitors are Intel's CPU-based hardware encoding (Quick Sync) on Kaby Lake processors (Core i7-7xxx) and Nvidia's GPU-based encoding on Pascal-generation GeForce GTX 10xx graphics chips, and the best of both are approximately equally powerful. There seems to be better software support for the Nvidia solution, though (but that is just a rough impression).
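For illustration, here is roughly what the two encoding paths look like as ffmpeg invocations. This is a hedged sketch: the file names and quality settings are invented, and the exact presets and rate-control options depend on your ffmpeg build and GPU.

```shell
# Software (CPU) H.265 encode via libx265: slow, but best quality per file size.
ffmpeg -i input_4k.mp4 -c:v libx265 -preset slow -crf 22 -c:a copy out_software.mp4

# Hardware (GPU) H.265 encode via NVENC: roughly 10x faster, bigger files at
# comparable quality (requires an Nvidia GPU and an NVENC-enabled ffmpeg build).
ffmpeg -i input_4k.mp4 -c:v hevc_nvenc -preset slow -cq 22 -c:a copy out_hardware.mp4
```

The same split applies to preview renders vs. final renders: use the NVENC line while editing, the libx265 line for the final export.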
As a consequence, when you want the best quality and accept a long final encoding run in exchange, the graphics board does not really matter much: it "only" has to be suitable for playing back 4k video and perhaps applying some live effects. So even a "previous generation" (Maxwell-based) Nvidia GTX 960 will do, as some models in this list of 4k editing notebooks have. You will, however, want the best and fastest CPU you can get, which (in gamer notebooks) seems to be approx. the Intel Core i7-7700HQ (2.8 GHz).
Second priority: display
The next question is what display to get. Choices are between 15" and 17" displays, and for both between Full HD (1920×1080) and 4k Ultra HD (3840×2160) resolutions. Without a 4k display, you obviously can't watch your 4k video in full glory while editing, but even with a 17" 4k display, pixels are so small that there is very little to no optical difference between a Full HD and a 4k Ultra HD display (as reported by gamers). You will have to zoom into frames to see quality differences anyway. But the price difference is sometimes just 200 EUR, which might make the 4k Ultra HD display worth having.
Third priority: main memory, mass storage
These things can be upgraded as needed, so you don't have to make a final decision at purchase time. 16 GB DDR4 RAM and a "128 GB SSD plus 1 TB hard disk" combination seem a reasonable minimum though. To speed things up, the SSD should be large enough for the operating system, software, and the video files of your current project, while the (cheaper and larger) hard disk would hold all the archived video editing projects.

Model recommendations
The most interesting models (high performance but at the lower end of the possible price range) that we found are these:

  • HP OMEN 17-w207ng, 1500 EUR, i7-7700HQ CPU, Nvidia GeForce GTX 1050 Ti, 17" display 3840×2160, 256 GB SSD, 1 TB HDD
  • HP OMEN 15-ax202ng, 1300 EUR, i7-7700HQ CPU, Nvidia GeForce GTX 1050, 15" display 1920×1080, 256 GB SSD, 1 TB HDD
  • Dell XPS 15 9560, ca. 1600 EUR, i7-7700HQ CPU, 15" display 1920×1080

More interesting information and sources

For a crisis-mapping project after the 25 April 2015 earthquake in Nepal, we needed a collaborative online database that is easy to set up and maintain. I found Obvibase to be the best existing solution. Much, much better than the misuse of Google Spreadsheets that people usually engage in.

However, the Obvibase software is not free software, and it is not perfect either. So here's my list of improvements that I just submitted to them as feedback, after a month of frequent and in-depth usage of their database. If somebody wants to do so: a free software clone of Obvibase plus the suggestions below would get us pretty close to the best collaborative ad-hoc database solution ever :-)

  • Alt+Arrow Left should work for "page back", but does not.
  • The "Main menu → More actions … → Restore …" action's form should list the person who has done an edit in another column, and also what changes were done (showing the before and after version of the affected cell).
  • For better privacy protection and because people are used to it already, sharing should work like in Google Docs (adding Google accounts that get access, with different access levels per account). Otherwise, accidentally sharing the access link gives people read-write access, which cannot be revoked again.
  • The "Page Up" and "Page Down" keys should work in the list view.
  • Pressing the space bar on the first selectable column (with the double arrow) should select the record, adding a checkmark to the very first column.
  • It should be possible to do simple styling of columns (background color, font color, bold, italics). Then one can mark one column as more important, and speed up visual navigation.
  • Column titles should be formatted in bold and / or with a background color, to speed up visual navigation on the screen.
  • There should be a way to create a new database in a new tab. If the menu link is a normal link that can be opened in a new tab, it will be enough.
  • There should be a sharing mode where people with the link can add records and edit or delete their own records, but not others'. It would have to require login with a Google account, because the general read-write access link cannot be shared publicly: destructive individuals might delete every record, or overwrite it with spam.
  • In the web-published, read-only version, long text cells should have an on-hover box showing the full text, as done for all other versions.
  • It should be possible to mark individual columns as non-public in the column settings, which would exclude them from display in the public version that is read-only accessible to all web visitors. This would allow collecting public and confidential information (like personally identifiable information) in the same database.
  • Full export to CSV. There is currently no simple way to export and re-import the full database incl. all nested records to one or a few CSV files. It is however needed for making local backups. The problem is that exporting the main database table misses records from any nested table, and exporting a nested table misses records from any other nested table, and also any record from the main database table that has no record in the exported nested table.
  • CSV exporting from nested tables should not repeat records from the main table. This happens currently, but redundancy is almost always a bad idea. Instead, the nested record should refer to the ID of its parent record ("foreign key relation").
  • SQLite3 export. Would be good for offline usage, as an interface to other tools, as a way to run complex SQL queries on the dataset, for comfortable backups (unlike CSV export, which misses some data), etc.
  • SQLite3 import, including reconciliation. To allow people "in the field" to contribute without Internet access, there should be an SQLite3 import feature. Concurrent edits would have to be reviewed before finishing the import.
  • ODS export. Not that important, but would be good for having an alternative to the SQLite3 format export. The export would include multiple sheets, one for the main database and one for each nested table. Embedded scripts would be used for filtering when following a link from the main database table to the associated records in the nested table.
  • When pressing Ctrl+V in a cell while it is not in edit mode, the current clipboard content should be entered into the cell. Currently, the Paste / Import window is shown, which is confusing because Ctrl+V is a clipboard operation in all applications, so people will use it intuitively (forgetting to go to "edit mode" before). The current behavior also has two issues: in addition to showing the Paste / Import window, a "v" is inserted into the cell when pressing Ctrl+V, and the focus is on the cell rather than in the text field of the Paste / Import window (which prevents the Esc key from being used to close that window).
  • When trying to change the filter value for a text column by clicking on the header, the caret is initially at the beginning of the value. It would be more useful to position it at the end. More importantly, even though the caret is in the filter value text box, the Home and End keys navigate to the first and last menu entry instead. They should instead move the cursor.
  • To reset a text column to "no filter value", it should also be possible to click the column header, delete the filter value in the text field, and then press the "Apply" button or the Enter key, because that is what people intuitively expect. Currently, a message appears saying "No text to search for".
  • Search by transliteration and equivalent Latin characters. To find records quickly, there should be no need to enter special characters. For example, when a record contains the name "Matjaž", it should be possible to find it via "Matjaz" and also via any transliteration of "Matjaž" to basic Latin characters.
  • In case of a synchronization error, currently a message appears saying that "some records have been rolled back". This means data loss, though at most approx. 30 s of editing. It is not due to concurrent edits, but rather to intermittent connectivity problems. Instead of causing data loss, another solution should be found: for example, asking if syncing should be retried, and finally offering the rolled-back data in CSV format so it can be copied and re-added quickly.
  • The database becomes hard to navigate if there are many columns, exceeding the screen width. For this quite frequent case, the database should allow several lines per record, distinguishing between vertically stacked columns by using different text formats.
  • The textbox showing values for multiple-choice columns has too much padding left and right, causing line breaks where there should not be any. Also, it should always be at least as wide as the column itself, for the same reason.
  • When changing selectable values in the settings dialogue for multiple choice columns, it should be possible to delete existing values and create new ones, or to edit existing values (which will edit all records that use the old value accordingly, saving a lot of work).
  • For power users, the filter field when clicking on column headers should allow regexp searches (and store them in a "recent regexp searches" sub-menu). This can help with analysing columns that contain comma-separated values, hierarchical text values and similar constructions.
  • URLs in all text columns should be automatically made into clickable hyperlinks (and automatically shown abbreviated when not in edit mode). Else, it's additional work to go into edit mode and select and copy the URL manually.
  • Subtable records should be reachable with no or very little additional work. This is esp. important when storing contact information there, as it frequently contains hyperlinks. Proposal: When hovering over the "[…] records" entry that links to a subtable, an on-hover window should appear that shows the subtable records, incl. columns, hyperlinks etc.
  • Currently, reverting to a prior version (via "Main menu → More actions → Restore …") will lead to a database that uses a new URL, without notifying the team about this. This means starting a new branch from the version one reverted to, without knowing, and without a way to merge changes into the main branch that the rest of the team is working on. This should be considered a bug.
  • It would be good to be able to see who's editing the database at the same time. Just like in Google Docs and Google Spreadsheets, where there are user icons showing up in the top right then. It could be the same user icons as used for the Google accounts, since login happens via a Google account anyway.
  • In the form to create a comment, it should be possible to press "Ctrl + Enter" instead of clicking "Save" with the mouse. It's faster, and it's how Google Docs comments work as well.
  • For more comfortable and faster visual navigation in long tables, it should be possible to style each column separately in the column settings. A combination of several options for influencing the style would be available: font size, font weight, font color, overflow / line break behavior etc.
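The CSV export points above can be illustrated with a small sketch. The file names and columns are invented; the standard `join` tool reproduces both the suggested foreign-key format and the data-loss problem of per-table exports:

```shell
# Hypothetical export of a main table and one nested table, in the
# foreign-key style suggested above (nested records refer to parent IDs):
cat > main.csv <<'EOF'
id,name
1,Alice
2,Bob
EOF
cat > nested.csv <<'EOF'
parent_id,phone
1,555-1234
1,555-9999
EOF

# Re-joining nested records to their parents on the id column:
join -t, <(tail -n +2 main.csv | sort -t, -k1,1) \
         <(tail -n +2 nested.csv | sort -t, -k1,1)
# Bob (id 2) has no nested record and silently disappears from the joined
# output, which is exactly the missing-records problem described above.
```

A real "full export" would need an outer join (or per-table files plus the IDs) so that parent records without nested records survive a backup/restore round trip.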

Downloading your OSMAnd~ maps to a computer and installing them to your Android phone from there has several advantages:

  • Use cheaper or faster network access. In a case where, for example, there is no wifi available but your computer is connected by wired Ethernet network, downloading the maps to your computer is probably much faster and cheaper than using your mobile data plan. (An alternative is configuring tethering at your computer. USB tethering and Bluetooth tethering would work but wifi tethering can be a challenge to set up since most wifi hardware in notebooks does not support access point mode, and Android might not support ad-hoc mode.)
  • Avoid tracking. You avoid being tracked by the OSMAnd~ maps server, which else "will send your device and application specs to an Analytics server upon downloading the list of maps you can download" [source].

So, how to do this?

  1. Just download the relevant zip archives from the OsmAnd maps index.
  2. Connect your Android phone and unpack the downloaded zip archives into the osmand folder on the SD card. (Alternatively, use adb push filename.ext /sdcard/0/osmand/ according to these answers).
  3. Start OsmAnd~. It should read and index the files (which for the first startup will take a bit of time).
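The steps above can be sketched as a short shell session. The map file name and download URL are placeholders, not real entries from the OsmAnd maps index:

```shell
# Step 1: download a map archive on the computer (URL is a placeholder):
wget http://example.com/osmand/SomeRegion_2.obf.zip

# Step 2: unpack and push the contained .obf file to the phone via adb:
unzip SomeRegion_2.obf.zip
adb push SomeRegion_2.obf /sdcard/0/osmand/
```

After this, start OsmAnd~ on the phone and let it index the new file.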

One of my websites was constantly throwing "Internal Server Error" errors, and that error appeared as follows in /var/log/apache2/error.log:

Thu Oct 16 17:59:14 2014 (19446): Fatal Error Unable to allocate shared memory segment of 67108864 bytes: mmap: Cannot allocate memory (12)

And that even though 9 GiB of memory was free. Also, only one website was affected; others ran fine. The error appeared even when requesting files that did not exist at all (independent of file type: JPG, PHP etc.). After ten minutes or so, the error would disappear on its own, for a while. Also, after restarting Apache the problem disappeared, for a while.


The problem was due to hitting the system limit of shmpages (shared memory pages), similar to this case. This limit effectively exists only in virtualized (VPS) environments, not on physical machines. Hitting this limit can be confirmed by running "cat /proc/user_beancounters", which in our case would output, right after the above error situation: shmpages held=247122 [...] limit=262144 failcnt=10067. At 4 KiB page size, the maximum allowed 262144 shared memory pages correspond to (262144*4096)/1024^3 = 1 GiB of shared memory.
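The pages-to-GiB conversion can be checked with plain shell arithmetic:

```shell
# 262144 shared memory pages at 4 KiB (4096 bytes) each:
PAGES=262144
PAGE_SIZE=4096
TOTAL_BYTES=$((PAGES * PAGE_SIZE))
echo "$TOTAL_BYTES bytes = $((TOTAL_BYTES / 1024 / 1024 / 1024)) GiB"
# → 1073741824 bytes = 1 GiB
```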

When restarting Apache, the problem would be temporarily resolved because shared memory usage would initially be lower: restarting reduced the number of shared memory pages held from about 247000 to 148000, as per cat /proc/user_beancounters.

The 67108864 bytes of shared memory to be allocated, mentioned in the error message, give a hint about what consumes this much shared memory: it is just the default value of PHP's opcache size of 64 MiB, configured in /etc/php5/cgi/php.ini (because 67108864 / 1024 / 1024 = 64). The problem is not that PHP-FCGI would try to allocate 64 MiB of opcache shared memory once, or once per site, but once per process [source]. PHP-FastCGI processes are reused for several requests [source], so the opcache caching still makes some sense, contrary to this opinion. However, these processes are relatively short-lived, as their number varies depending on site load, so the caching does not add much benefit. And worse, the opcache caches are not shared between the PHP-FastCGI processes; each one gets its own. With about 10 processes per site, each consuming 64 MiB of shared memory, we quickly hit the 1 GiB shared memory limit of the VPS, as above. To illustrate, the free command indicated the following shared memory usage in my case:

  • 69 MB some seconds after stopping apache2
  • 136 MB immediately after starting apache2 (service apache2 start; free)
  • 500 – 700 MB some 30 – 180 s after starting apache2
  • 989 MB typically when apache2 is running for a long time, very close to the 1 GiB limit already

Nearly all of this shared memory usage comes from the php-cgi processes, as can be seen in top output (use "b" to toggle background highlighting, "x" to toggle highlighting of the sort column, and "<" / ">" to switch to SHR as the sort column). Namely, when 991 MB of shared memory were consumed, 620 MB (= 9 * 64 MB + 2 * 40 MB) of this was consumed by 11 php-cgi processes.
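Instead of eyeballing top, one can sum the SHR column per process name from a snapshot. A sketch over an invented sample (the process names and KiB figures are made up for illustration):

```shell
# Invented snapshot: "<process> <SHR in KiB>" per line, as one might
# extract it from ps or top output.
cat > shr_sample.txt <<'EOF'
php-cgi 65536
php-cgi 65536
php-cgi 40960
apache2 2048
EOF

# Sum the shared memory attributed to php-cgi processes:
awk '$1 == "php-cgi" { sum += $2 } END { print sum " KiB" }' shr_sample.txt
# → 172032 KiB
```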


The solution is to use PHP-FPM instead of PHP-FastCGI [source]. Contrary to that source, this works independently of the webserver, and thus also with Apache2.

After deploying this solution, you can compare the "Zend OPcache" section of phpinfo() output from different sites on your server. As the "Cache hits", "Cache misses", "Cached scripts" and "Cached keys" numbers show, there is only one single OPcache for all your PHP sites. However, you can configure opcache parameters differently for different sites (like memory usage etc.), and this is also reflected in the phpinfo() output. I can't really make sense of that so far, but assume that memory usage etc. are indeed configured per PHP-FPM process pool. So 128 MiB for one site and 64 MiB for another would allow for 192 MiB total shared memory usage for opcache.
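If the per-pool assumption holds, the per-site setting would live in the pool definition. A hedged sketch (the pool name, socket path and process-manager values are invented, and I have not verified that opcache.memory_consumption actually takes effect per pool):

```ini
; Hypothetical pool file /etc/php5/fpm/pool.d/site1.conf
[site1]
user = www-data
group = www-data
listen = /var/run/php5-fpm-site1.sock
pm = dynamic
pm.max_children = 10
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
; Per-site opcache memory, as discussed above:
php_admin_value[opcache.memory_consumption] = 128
```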

After switching the site to PHP-FPM, there was only a 70 MiB max. difference in shared memory between "Apache2 and php-fpm running" and "not running", compared to the 900 MiB earlier. This was generated by five php-fpm processes running simultaneously, with 40 – 50 MB shared memory "each". So clearly, the shared memory is indeed shared between the php-fpm processes instead of each having its own.

(Another tip: there are other important ways to optimize OPcache, see this blog post.)

Working solution

Finding this solution was quite a nightmare. But, here it is.

This assumes you have a working setup of Panels in Drupal 7 already.

  1. Use a group path prefix. Set up pathauto to include your group's path as a prefix for group content. This is a challenge by itself due to various bugs and changes in Drupal, but here is my solution.
  2. Edit the node template panel page. In the CTools page manager (at /admin/structure/pages), enable the node_view (node template) page, if not already done, and then edit it (at /admin/structure/pages/edit/node_view).
  3. Create a new Panel variant.
  5. Create a selection rule by path. Click "Selection rules" in the left sidebar for your new Panel variant. Then select "String: URL path" from the list, add it, and configure it to use your group's path prefix with a wildcard, like groupname/*.
  6. Reorder your variant. Click "Reorder variants" at the top right (should be /admin/structure/pages/nojs/operation/node_view/actions/rearrange). Reorder the variants to make yours come after "Node panelizer" and before "Fallback panel". Drupal will work through the variants and select the first one whose selection rules match, so placing yours after "Fallback" would make it never show up.
  6. Test. Go to some non-panelized group content that has your group's path prefix, and see if the panel variant gets applied. To see an effect, you will have to add content to the panel variant of course.

Alternative solutions

The following solution should also work:

Panelize display modes, set display mode per group. This is a combination of the following:

  • ds_extras: needed to create additional view modes for content types
  • context: for general context management, needed by context_og
  • context_og: needed in context to detect when a node belongs to a certain group
  • contextual_view_modes: to set a view mode for a content type based on its context
  • panelizer: to panelize a view mode

You would define a context for each group for which you want a default panel, triggered by a node belonging to that group. Then you would configure the content types (with the options added by contextual_view_modes) to set, for each of these contexts, a specific view mode that you created via ds_extras. And use Panels (or rather Panelizer, I think) to create default panels for these new view modes.
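If you go this route, the module combination listed above could be enabled in one go with drush (a sketch; it assumes drush is installed and the modules have already been downloaded):

```shell
# Enable the module combination for the view-mode-based solution:
drush en -y ds_extras context context_og contextual_view_modes panelizer
```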

This solution is semantic, since it is not dependent on URL patterns. However, it is also more complex, clutters the view modes namespace, and requires one new panel for each content type / view mode combination, rather than just one in the node template.

Non-working solutions

The following solutions that I tried did not work:

Selecting a Panel variant by group membership. This should work – see my instructions here. But currently (2014-09) it does not due to Drupal issue #2242511 "How to create panel variant with selection rule for groups audience field". There is an impractical workaround available, but "somebody" should go in there and fix it for real …

Using og_panels. This would be the most comfortable variant, as it also avoids the need to be an admin to change a group's default layout. However, there is no Drupal 7 port of the og_panels module. See: Drupal issue #990918.

Selecting panel variants via contexts. Proposal: Create a solution from a combination of the following:

  • context: for general context management, needed by context_og
  • context_og: to detect when a node belongs to a certain group
  • panels: to set a panel variant in the Panels selection rules, based on the context detected via context_og

Problem: Panelizer does not list the context / context_og contexts in the list of contexts of the "Context exists" selection rule. This is because these are simply two different things called "context": the Panels module does not even depend on the context module, nor vice versa.

The "Context exists" selection rule will list the "Panels contexts" that are defined in the context section above the selection rules section, if and only if that context field contains data. See this wording in the "Context exists" selection rule: "Check to see if the context exists (contains data) or does not exist (contains no data).", and the reproduction instructions from Drupal issue #1263896. This means that in effect, "context" in Panels means the same as "contextual filters" in views: it is a way to pass in arguments that can be grabbed by the view resp. the content elements on the panel, and using them in "Context exists" is rather a side use.

Using context_panels_layouts. Proposal: Create a solution from a combination of the following:

  • context: for general context management, needed by context_og
  • context_og: to set a context when a node is part of a certain group
  • context_panels_layout: to set a panel as a context reaction
  • panels: to set a panel variant in the Panels selection rules, based on the context detected via context_og

Problem: This does not work, as context_panels_layout can only be used to set Panels layouts, not actual panel variants that also include the content added to the layout.

Powerline is a technology for transmitting data over the AC grid. All devices provide an RJ45 Ethernet plug (and some also wifi and USB), so they support client devices of all operating systems. Basic configuration of encryption is likewise OS independent, as nearly all of them use a "pairing" button for that. However, to set the encryption key manually, make more detailed settings, read out speed statistics and update the firmware, most devices come with configuration tools that only work under Windows.

So, can we find a device that can be fully managed under Linux, ideally with free software? After a long search, this turns out to be a simple task: just use any device that supports the Homeplug AV or Homeplug AV2 standard, and manage it with Open PLC Utils, a full-featured, free software management tool developed by AV/AV2 chipset maker Atheros. It allows full device configuration, info reading and firmware upgrading for all INT6000/6300/6400/AR7420 devices, even tampering with the parameter set in the firmware. The only alternative is DS2 chipset based devices, since at least most of them can be fully configured via a built-in web interface (such as the COMTREND PowerGrid 9020, their most modern device). However, a tool like Open PLC Utils is preferred here since it effectively replaces part of the DS2 management software with free software, allowing scripting, modifications etc., and is more powerful in general.

Recommendable devices

It is said that the exact device you choose does not matter much in terms of connection speed and quality, as long as you choose the right chipset. Here, we focus on the Homeplug AV chipsets (INT6000, INT6300, INT6400), and there on the most modern one (INT6400). We avoid the Homeplug AV2 chipsets because they do not yet seem mature (see below), and the DS2 chipsets because they are not as readily configurable under Linux.

Recommendation for Germany: Speedport Powerline 100. The best recommendation so far is the Telekom Speedport Powerline 100. It has the INT6400 chipset (the most modern one for the Homeplug AV 200 Mbit/s devices), and it is dirt cheap (ca. 20 EUR incl. shipment for a pair of them) because Telekom made Germany awash with these devices. It is also good quality, since it is a whitelabel product from the well-known French brand LEA Networks – see also the very detailed test (in German). One can use the LEA-made software tool SOFTPLUG for settings beyond simple pairing for encryption, but it runs only under Windows. It is not needed however, as the free software tool Open PLC Utils can likewise be used, and runs under Linux.

Discussion of alternatives: Devolo dLAN AVsmart+. These devices are quite nice, as they have an extended status display on the device – compare the manual. They are also readily available in used condition on eBay, but not as dirt cheap as the Speedport Powerline 100. However, they use the older INT6000 chipset [source], leading to about a third less speed in long-range applications [source].

Discussion of alternatives: COMTREND PowerGrid 9020. This is not a Homeplug AV/AV2 device, but uses the non-interoperable DS2 chipset. Reviews on Amazon are good. It can still be fully managed under Linux, as all settings are available within a web interface, as argued for and against above. The PowerGrid 9020 is the most modern Powerline device of Madrid-based manufacturer COMTREND so far, and is available in UK and Europe type plug versions. The UK plug versions are very readily available online, as they are provided by the large UK-based ISP British Telecom (BT) to their customers. The Europe plug versions are quite hard to find though (but here are some). Also see the installation guide and the full manual.

Discussion of alternatives: Homeplug AV2 devices. There are even more modern 500 Mbit/s Homeplug AV2 devices (using the AR7420 etc. chipsets); however, as of 2014-09 this generation of technology does not yet seem mature, often suffering from connection breakdowns, low throughput, devices running hot etc. (as judged from reviews on Amazon). So we avoid them here, also because they are still more expensive on the second-hand market. But your priorities and mileage may vary. They, too, can be fully managed with Open PLC Utils.

Linux software for powerline adapters

While Open PLC Utils is clearly the winner, the following list covers all the powerline management software I came across:

  • Open PLC Utils. As said, clearly the winner: full-featured and free software.
  • Faifa. A manufacturer independent, free software Homeplug AV/AV2 utility for Linux. Allows low-level functions such as control frame dumps. Considered to be the successor to the older plconfig utility, see below.
  • plconfig. An older, simple Linux based configuration utility. For download on Github. Superseded by Faifa now.
  • dlanlist, dlanpasswd. Open source software that can be downloaded and compiled under Linux and is meant to list (and set the password of) Devolo dLAN devices. Since these conform to the Homeplug AV/AV2 standard, the software might be used to also configure other devices, after some simple modifications.
  • devolo Cockpit. An extensive configuration software for Devolo dLAN devices. Seemingly not available in source form, so adaptations to other devices are not possible.
  • Intellon device manager for 3.x firmware (Windows software). Allows full management of INT6x00 chipset devices when using a 3.x firmware version, incl. editing firmware parameters. Its use is described in this article.
  • Intellon device manager for 4.x firmware (Windows software). In contrast to the version for 3.x firmwares, this does not support full management of the INT6x00 chipset devices any more.

Background information

The following in-depth articles provide relevant background information about Powerline technology and its successful use:

Modding and hacking powerline adapters

At 10 EUR or less for a used device, powerline adapters are so cheap that they lend themselves to several non-standard uses. At least the following come to mind:

  • Powerline noise filters. Devices that include an AC plug with noise filter (such as the Speedport Powerline 100 recommended above) are the cheapest option for use as frequency filters, preventing contamination of the 50 Hz mains frequency by other digital devices, mobile phone chargers etc.
  • Powerline bridge. It is also possible to create a simple bridge by connecting two of them with a crossover Cat5e cable (or a bridge device in between, if necessary). This should allow crossing phase boundaries in the home cabling.
  • VDSL P2P modem replacements. And then, of course, these cheap devices are a natural candidate for a cheap DIY replacement for pairs of VDSL modems (usually 150 EUR per piece!). It works by connecting twisted-pair phone wire to the signal (via soldering) before it gets modulated onto the AC mains power. This should allow transmitting data over several hundred meters at least.


When you use FTP for recursive downloads or uploads and the FTP server has a firewall installed, you might get blocked due to "too many connections". Nearly all of those connections would be in TIME_WAIT state, and are only visible on the server side (not when you check on your client with e.g. netstat -anp --tcp | grep <ip-address>). The exact threshold of "too many" depends on firewall configuration (but can be something like 300 – 800). You will probably see no error message at all, and cannot reach the FTP server any more, either temporarily or permanently. What causes this problem, and how to avoid it?

First, this is a completely normal feature of the FTP protocol in passive mode: "Some existing protocols, such as FTP, make use of implicit signalling, and cannot be retrofitted with TIME_WAIT controls." – source. Additionally, FTP does not provide for TCP connection reuse as HTTP/1.1 does, so we are indeed stuck with using one TCP socket for transferring one file via FTP. Using up lots of server ports is not nice of the FTP protocol though (they stay in TIME_WAIT for up to four minutes!), since in extreme cases it can lead to the server running out of ports in the ephemeral port range for incoming connections. That's why the firewall is jumping in …

To understand how to work around this, we first have to know a bit about FTP active / passive mode and TCP TIME_WAIT:

"In active mode, the client establishes the command channel (from client port X to server port 21) but the server establishes the data channel (from server port 20 to client port Y, where Y has been supplied by the client).
In passive mode, the client establishes both channels. In that case, the server tells the client which port should be used for the data channel."
[Stackoverflow answer by paxdiablo, CC-BY-SA-3.0]

Now, TIME_WAIT is a state that a socket enters for around 4 minutes after a TCP connection has been closed cleanly. It exists for two reasons: preventing delayed segments from being misinterpreted as part of a new connection, and ensuring reliable full-duplex connection termination (both explained in detail in a great article).

Fix 1: Use active mode FTP

The usual recommendation (like here) for avoiding being blocked by a firewall due to TIME_WAIT connections is to switch from passive to active FTP mode. Here's my (preliminary) understanding of why this works:

In the FTP protocol, one TCP data connection is used per file transfer, in both passive and active mode. However, in active mode the server-side port of these data connections is always 20, while in passive mode it is a random port number for each new connection, which will then sit in TIME_WAIT state for 2-4 minutes after the connection ends. That is why "the downside to passive mode is that it consumes more sockets on the server [and] could eventually lead to port exhaustion" [source]. Not exactly, though: a socket in TIME_WAIT state does not have to block the whole port from reuse, only the exact socket (which is the combination of two addresses and two port numbers); many operating systems nonetheless naively implement it so that the whole port is blocked [source]. And in line with this naive implementation, firewalls naively assume that the port is still blocked (even if it is not) and hence count TIME_WAIT connections as if they were still active. Now, if you switch to active mode FTP, the server uses port 20 for all outgoing data connections, with a different socket for each. So the same number of sockets are consumed, and the same number sit in TIME_WAIT state for the same amount of time, but since they all belong to one port, the firewall will not see it, and will not block you.
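In lftp, for example, switching to active mode is a one-line setting. The setting name ftp:passive-mode is real; presenting it as an rc-file fragment is just one way to apply it (it also works interactively):

```shell
# ~/.lftp/rc — force active (PORT) mode FTP, so that all data
# connections use server port 20 instead of a fresh random port
set ftp:passive-mode off
```

Note that active mode requires the client to accept incoming connections, so it may fail if a NAT router or client-side firewall sits in between.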

This is obviously a relatively crude way of solving the issue. A better way would be firewall whitelisting (if accessible to you), or a deep inspection firewall with rules that do not count TIME_WAIT connections resulting from passive mode FTP.

Fix 2: Limit the number of transfers per minute

If a firewall configuration allows us (say) 400 connections, and sockets stay in TIME_WAIT state for 4 minutes, that allows for 100 file transfers (each with its own TCP connection) per minute without being blocked. Technically, recursive download commands could support such a limit, but I am not aware of any FTP client that does. Not even lftp has this: it has the net:connection-limit option, but that only applies to currently active connections. A connection that is in TIME_WAIT on the server is already considered closed by the client, so it will not count towards this limit.

However, a relatively nice approximation is this: use the mirror --script=FILE command to just generate a script that downloads the files one by one. Edit that script to insert lftp's sleep 4m command after each batch of files that comes close to triggering the firewall blocking, then execute that script in lftp. All of this could be wrapped in a nice script as well …
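A sketch of that wrapper step, assuming the hypothetical firewall budget of ~100 transfers per minute from the example above (the helper name add_pauses and the batch size of 90 are my own choices; lftp's sleep command accepts 4m-style durations):

```shell
#!/bin/sh
# Insert "sleep 4m" after every batch of transfer commands in a
# script generated by lftp's: mirror --script=dl.lftp
# The batch size is a parameter; 90 keeps a safety margin below a
# hypothetical 100-transfers-per-minute firewall budget.
add_pauses() {
  awk -v n="${1:-90}" '{ print } NR % n == 0 { print "sleep 4m" }'
}

# Typical use:
#   add_pauses 90 < dl.lftp > dl-paused.lftp
#   lftp -f dl-paused.lftp
```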

Alternatively, one might use a very restrictive data rate limit to stay below the "files per minute" threshold. In lftp, that would be net:limit-total-rate (compare its manpage). Some people also reported that limiting the max. number of parallel connections to one helped [source] – it works the same way as rate limiting, but is of course far from guaranteed to help in your case.
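For completeness, the corresponding lftp settings as an rc-file fragment; the setting names are real lftp options, but the values are purely illustrative and would need tuning against the actual firewall limits:

```shell
# ~/.lftp/rc — crude throttles to stay below the firewall's
# connections-per-interval threshold (values are illustrative)
set net:limit-total-rate 200K         # cap the overall transfer rate
set net:connection-limit 1            # at most one connection per site
set mirror:parallel-transfer-count 1  # no parallel mirror transfers
```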