Ben Schmidt’s terrific insight into the assumptions that the Fourier transform imposes on sentiment data has been sinking in, and I have a left-field suggestion for anyone who cares to check it out. I plan to investigate it myself when I have the time, but I’ve decided to broadcast it now.
In the imaginary universe of Fourier land, all texts start and end at the same sentiment amplitude. This is clearly incorrect, as I see it.1 But what could we say about the beginning and end of texts that might hold up?
One possibility is that all texts might start and end with a flat sentiment curve. That is, at the very beginning and end of a text, we can assume that the valence of words won’t shift dramatically. That’s not clearly incorrect. I think it’s even plausible.
Now consider how we talk about plot most of the time: we speak of rising action (slope positive), falling action (slope negative), and climaxes (local and global maxima). That’s first derivative talk! And the first derivative of a flat curve is always zero. So if the first derivative of a sentiment curve always starts and ends at zero, then it starts and ends at the same value, which is exactly what the Fourier transform’s periodicity assumption demands. At least one objection to the Fourier transform approach can therefore be worked around. For example, we could simply take the first finite difference of a text’s sentiment time series, perform a discrete Fourier transform (DFT), apply a low-pass filter, do an inverse transform, and then take a cumulative sum (i.e. a discrete integration) of the result.2
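Here is a rough sketch of that pipeline in Python with NumPy, just to make the steps concrete. The function name `smooth_via_derivative` and the `keep` parameter are my own placeholders, and the hard frequency cutoff is only one of several possible low-pass filters, so treat this as an illustration under those assumptions rather than a tested method.

```python
import numpy as np

def smooth_via_derivative(sentiment, keep=3):
    """Low-pass filter the first difference of a sentiment series,
    then integrate back. `keep` (a hypothetical parameter) is the
    number of nonzero frequencies retained alongside the zero-frequency term.
    """
    x = np.asarray(sentiment, dtype=float)

    # First finite difference: d[i] = x[i + 1] - x[i]
    d = np.diff(x)

    # DFT of the real-valued difference series
    spectrum = np.fft.rfft(d)

    # Crude low-pass filter: zero everything above the cutoff.
    # Index 0 (the mean slope) is always retained.
    spectrum[keep + 1:] = 0

    # Inverse transform back to a smoothed difference series
    d_smooth = np.fft.irfft(spectrum, n=len(d))

    # Discrete integration: cumulative sum, offset by the starting value
    return x[0] + np.concatenate(([0.0], np.cumsum(d_smooth)))
```

Because the zero-frequency term is kept, the smoothed differences sum to the same total as the original ones, so the reconstructed curve begins and ends where the raw series does instead of being forced back to its starting amplitude. Something like `smooth_via_derivative(scores, keep=3)` would return an array the same length as `scores`.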
What would that look like?