• Please make sure you are familiar with the forum rules. You can find them here: https://forums.tripwireinteractive.com/index.php?threads/forum-rules.2334636/

PC KF2 Linux Dedicated Server: first-connection delay, "Flushing async loaders" [v1053]

notblackout

Member
Aug 6, 2016
7
2
75
KF2 Linux Dedicated Server: first-connection delay, "Flushing async loaders" [v1053]

Category: Linux DS

Reproducibility: Always

Versions Tested: v1053, v1054

Summary: The first client to connect to a Linux DS after startup or mapchange is subjected to an approximately two-minute delay. Subsequent connections are instant.

Description:

I'm on the steam mainline channel (not the beta channel).

This happens in every configuration I've tried. I iterated through a dozen or so config option tweaks, hoping for something that might affect this behavior, to no avail. To simplify reproduction, I then reverted to a default config as follows:
  1. Killed server
  2. Moved my KFGame/Config/LinuxServer*ini files out of the way (into a safe directory under ~)
  3. Restarted server to let it regenerate config files
  4. Killed server again, set GamePassword=foobar
  5. Restarted server
I started the server using this script:

Code:
#!/bin/bash

Binaries/Win64/KFGameSteamServer.bin.x86_64 kf-outpost?difficulty=3?gamelength=2?bIsDedicated=true?bIsLanMatch=false?MultiHome=45.33.113.137

I then connected with the mainline KF2 windows client. In the process of accepting my client, the server spends a two-minute interval (!) between "Flushing async loaders" and "Flushed async loaders". I ran this test four times, and I also saw intervals of 106, 107, and 109 seconds.

During this flushing interval, my client sits at the main menu, with no indication of progress towards joining the server (but also not frozen). My client eventually timed out before "Flushed async loaders" was logged on the server.

After "Flushed async loaders" appeared in the server logfile, I reconnected and got in immediately. I started the game and played through wave 1 with no problems and low resource utilization. Then I disconnected in trader time.

Here is the server-side logfile and contemporaneous vmstat output on the server:

Server logfile and vmstat output: https://gist.github.com/notblackout/...19e3213317b7b0

The vmstat output attached shows this host has resource headroom to run the actual game, and shows "flushing async loaders" is pegging a CPU while leaving the disk idle and exerting no memory pressure. We never even get close to swap.

Here's a snippet from the server log during my first connection.

Code:
[0032.52] DevNet: Login request: KFMainMenu?Name=blackout?Team=255?Password=foobar
[0032.52] ScriptLog: KFGameInfo.MakeReservations
[0032.52] ScriptLog: KFGameInfo.ReadReservations reading reservations from ?Name=blackout?Team=255?Password=foobar
[0032.52] ScriptLog: KFGameInfo.ReadReservations for player 0x011000010070AB0A
[0032.52] ScriptLog: AddPlayerMatchmakingGroup: Not creating a group for a single player.
[0032.52] ScriptLog: KFGameInfo.UniqueReservationAdd 0x011000010070AB0A
[0032.52] ScriptLog: KFGameInfo.UniqueReservationAdd adding 0x011000010070AB0A
[0032.52] DevNet: Login process paused, waiting for ResumeLogin...
[0032.72] DevOnline: VerifyClientAuthSession: ClientIP: 104.54.197.17, ClientUID: 76561197967649546, Ticket CRC: E59B7430
[0033.12] DevOnline: ClientAuthComplete: bSuccess: 1, SteamId: 76561197967649546, ExtraInfo:
[0033.12] ScriptLog: Client passed authentication, UID: 0x011000010070AB0A
[0033.12] DevOnline: CreateServerAuthSession: ClientAddr: 104.54.197.17, ClientUID: 76561197967649546
[0033.12] DevOnline: CreateServerAuthSession: CurAuthTicket->FinalAuthTicket.Num(): 64, SessionUID: 2
[0033.12] DevOnline: CreateServerAuthSession: Ticket CRC: 34C4976B
[0033.12] Log: Flushing async loaders.
[0160.67] Log: Flushed async loaders.
[0160.68] ScriptLog: KFGameInfo.ConfirmReservation
[0160.68] ScriptLog: KFGameInfo.ConfirmReservation for player 0 0x011000010070AB0A
[0160.68] ScriptLog: KFGameInfo.TimeReservations 0 0x011000010070AB0A timer is -1
[0160.68] DevOnline: Refreshing published game settings...
[0160.68] DevOnline: Server data: Ver: 9189, Ded: 1, Region: 255, Slots: 6, Pass: 1, Server: Killing Floor 2 Server, Map: KF-Outpost, Players: 0 Bots: 0, Game: KFGameInfo_Survival

While the server was busy "Flushing async loaders" (or whatever that represents), the process was busywaiting on a futex, pegging a CPU core to do nothing. Attaching strace,we see what looks like userspace spinlocking... the process is clearly waiting for something it expected to happen immediately, but it's taking longer than expected:

Code:
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
futex(0x7f8ebc011ce4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f8ebc011ce0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0
sched_yield()                           = 0


The only saturated resource during this time is the CPU. Note that CPU utilization never gets above 20% when I'm actually playing, but whatever "Flushing async loaders" does entails 99% utilization and no disk or memory activity. Because the machine only runs the KF2 DS, every sched_yield succeeds, but then the kernel just returns control right back to the yielding thread that's blocked on a futex, artificially inflating CPU usage.

This server is hosted at Linode. It does nothing except host the KF2 DS. I just created this VM tonight for this purpose. Its iptables are currently wide open -- I've set all chains to default policy ACCEPT and cleared their rulesets. Specs:
  • 30 GB SSD
  • 2 GB RAM
  • 1 processor (Xeon E5-2697)
  • Debian 8.7
The machine is under no memory pressure. It has a gigabyte just sitting around doing pagecache duty, and it never touches its swap. The block devices are also essentially idle.

Please tell me if there is a way to get better debugging information here. Adding -debug to the server command line seemed to do little or nothing. I could not figure out what piece of code is logging "Flushing async loaders". Recursive grep in the SDK did not match anything, and using strings on likely suspects in the binaries and kfgame directory did not match either. I have no idea where this is coming from right now, or what it means. I'm also aware that it could be red herring. Perhaps the "flushing async loaders" code is invoking something else internally that does all the busywaiting, but silently.

EDIT: I doubt anyone from TWI will want to bother trying to repro on my machine, but in case you do, just post an ssh public key and I'll give you root. The machine's sole purpose was to test KF2 DS, so you can do whatever you want to it, even if it bricks the thing.
 
Last edited:
The v1054 patchnotes didn't mention this, but sometimes the first response to a bug report is "upgrade and then try again", so I figured I would check whether this problem still exists. It does.

I just updated my server and got the same behavior.

New compile timestamp: "Init: Compiled (64-bit): Jun 21 2017 11:54:19" (coincided with client v1054)
Old compile timestamp "Init: Compiled (64-bit): Jun 13 2017 10:03:38" (coincided with client v1053)

I can't edit the post title to update "[v1053]", so I instead added a note to the editable post body.
 
Upvote 0
Is there any update on this? I had expected the devs to recognize notblackout's elaborate report with at least a short answer, like "we'll look into it."

This issue is very annoying and definitely affecting "regular" players that are not involved in that whole server thing, since I also host servers provided to the public and takeover systems.
 
Upvote 0
I installed a fresh machine, no extras, nothing. A simple SSH access and that's it. Patched the packages, installed steamcmd and kf2. Done.

Did that on Debian 8.8 and Ubuntu 16.04. After installing, tweaked the config files in server names and kfwebadmin. Nothing extraordinary. I think I'm gonna try it again with 14.04 since there is nothing else I could go after right now.
 
Upvote 0