So... if I see something like "Y�", which converted back to bytes is 59 EF BF BD... what was this before the HTML conversion? Who is changing that data in transfer? And why?
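Answer a) The "EF BF BD" is the UTF-8 encoding of U+FFFD, the Unicode "replacement character". A decoder inserts it when it hits a byte sequence that is not valid in the encoding it expects (for example, a Latin-1 byte in data that is read as UTF-8), so it marks an undecodable byte rather than a non-printable one. Yes... replaced - the original byte is gone for good and cannot be recovered from the output.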
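To illustrate, here is a minimal Python sketch of how such a sequence can come about. The input bytes are an assumption for the sake of the example (a "Y" followed by 0xE9, which would be "é" in Latin-1); the actual original byte in the question is unknown.

```python
# Hypothetical input: 'Y' followed by 0xE9, which is not valid UTF-8 on its own.
raw = b"Y\xe9"

# A lenient UTF-8 decoder substitutes U+FFFD for the byte it cannot decode.
decoded = raw.decode("utf-8", errors="replace")

# Writing the text back out as UTF-8 turns U+FFFD into EF BF BD.
reencoded = decoded.encode("utf-8")
print(reencoded.hex(" ").upper())  # 59 EF BF BD

# A different invalid byte ends up as exactly the same text,
# so the original value cannot be worked out afterwards.
other = b"Y\xff".decode("utf-8", errors="replace")
print(other == decoded)  # True
```

Since two different inputs collapse to the same output, the only real fix is to decode with the correct encoding at the point where the bytes are first read.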