The thing I was doing wrong was on the audio side: I was setting the pts (presentation timestamp) after the frame had been filtered by ffmpeg, which could be several frames later, and the output could be batched.
I switched to setting the pts before filtering, but that made things worse. For some reason, the audio filter remaps the pts I use (a frame count in a 1/60 time base) to a 1/44100 time base.
The video filter does not do this, which is why I explicitly remap the video pts in a later step.
I had to make that later remap step skip the audio channel, since the audio pts has already been remapped by the filter.
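
Roughly, the fix looks like the sketch below. This is plain Python arithmetic, not the actual ffmpeg calls: `rescale_pts`, `remap_after_filter`, and the 1/90000 output time base are hypothetical names and values picked for illustration. Only the 1/60 and 1/44100 time bases come from the description above.

```python
from fractions import Fraction
import math

# Time bases from the description above.
FRAME_TB = Fraction(1, 60)      # pts counted in frames at 60 fps
AUDIO_TB = Fraction(1, 44100)   # what the audio filter remaps to


def rescale_pts(pts: int, src_tb: Fraction, dst_tb: Fraction) -> int:
    """Convert a pts from src_tb ticks to dst_tb ticks, rounding half up."""
    return math.floor(pts * src_tb / dst_tb + Fraction(1, 2))


def remap_after_filter(pts: int, is_audio: bool, dst_tb: Fraction) -> int:
    """Hypothetical post-filter remap step.

    Audio pts is assumed to have been rescaled by the filter already,
    so it is passed through unchanged. Video pts is still a frame
    count in FRAME_TB and gets rescaled here.
    """
    if is_audio:
        return pts
    return rescale_pts(pts, FRAME_TB, dst_tb)


# Assumed output time base, for illustration only.
OUT_TB = Fraction(1, 90000)

# One 60 fps frame is 735 ticks at 1/44100.
one_frame_in_audio_ticks = rescale_pts(1, FRAME_TB, AUDIO_TB)

# Video frame 60 (one second in) gets remapped; audio is left alone.
video_pts = remap_after_filter(60, is_audio=False, dst_tb=OUT_TB)
audio_pts = remap_after_filter(44100, is_audio=True, dst_tb=OUT_TB)
```

Running the remap on audio a second time would scale a pts that is already in 1/44100 ticks as if it were a frame count, which is why the step has to skip it.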