would be divided up into:
procedure        tsKeyWord
(space)          tsSpace
TForm1           tsIdentifier
.                tsSymbol
FormCreate       tsIdentifier
(                tsSymbol
Sender           tsIdentifier
:                tsSymbol
TObject          tsIdentifier
);               tsSymbol
(space)          tsSpace
{Create Form}    tsComment
<CR><LF>         tsCRLF
How is it Done?
The RichEdit control normally loads preformatted text from .RTF files by way of the RichEdit.Lines.LoadFromFile() function. YourPasEdit uses the RichEdit.Lines.LoadFromStream() function to load the file from a TPasConversion - a custom TMemoryStream descendant. This stream takes the plain-text Pascal source file, loads it into its internal memory buffer, and then converts it from plain text to text impregnated with RTF codes. This way, when it is loaded into the RichEdit control via RichEdit.Lines.LoadFromStream, the Pascal source file appears in the control with colour syntax highlighting.
To the main Editor, this process is transparent - the code looks something like this:
NewRichEdit := TRichEdit.Create;
PasCon.Clear; // Prepare the TPasConversion
PasCon.LoadFromFile(FName); // Load the File into the Memory Stream
PasCon.ConvertReadStream; // Convert the stream to RTF format
NewRichEdit.Lines.BeginUpdate;
NewRichEdit.Lines.LoadFromStream(PasCon); // Read from the TPasConversion
NewRichEdit.Lines.EndUpdate;
NewRichEdit.Show;
Result := NewRichEdit;
end;
EXAMPLE - snippet of code from the NewRichEditCreate(Fname) routine
As I said, it is the TMemoryStream-derived TPasConversion which does all the hard work:
<SOURCE PASCAL FILE>
|
V
Plain source loaded into memory
(TPasConversion.LoadFromFile)
|
V
Converted internally by parsing the source file
(ConvertReadStream)
|
V
Result made available
(SetMemoryPointer)
|
V
RichEdit.LoadFromStream
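The pipeline in the diagram can be sketched in a few lines. This is only an illustration under assumptions, not the Pas2Rtf source: TUpperConversion is a hypothetical stand-in for TPasConversion, the "conversion" merely upper-cases the text instead of inserting RTF codes, and a TStringList takes the place of RichEdit.Lines so that the program runs without a form.

```pascal
program StreamConvertSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes;

type
  // Hypothetical stand-in for TPasConversion: it holds plain text and
  // then replaces its own contents with a converted version.
  TUpperConversion = class(TMemoryStream)
  public
    procedure ConvertReadStream;
  end;

procedure TUpperConversion.ConvertReadStream;
var
  Src: AnsiString;
  Len, I: Integer;
begin
  // Copy the current contents out of the stream's memory buffer.
  Len := Integer(Size);
  SetLength(Src, Len);
  Position := 0;
  if Len > 0 then
    ReadBuffer(Src[1], Len);
  // Placeholder conversion: upper-case only (the real class emits RTF).
  for I := 1 to Len do
    Src[I] := UpCase(Src[I]);
  // Replace the stream contents with the converted text and rewind,
  // so that a later LoadFromStream starts reading at the beginning.
  Clear;
  if Len > 0 then
    WriteBuffer(Src[1], Len);
  Position := 0;
end;

var
  Conv: TUpperConversion;
  Lines: TStringList;
  Source: AnsiString;
begin
  Source := 'begin end';
  Conv := TUpperConversion.Create;
  Lines := TStringList.Create;
  try
    Conv.WriteBuffer(Source[1], Length(Source)); // stands in for LoadFromFile
    Conv.ConvertReadStream;                      // convert inside the stream
    Lines.LoadFromStream(Conv);                  // stands in for RichEdit.Lines
    WriteLn(Lines.Text);
  finally
    Lines.Free;
    Conv.Free;
  end;
end.
```

The detail worth noting is the rewind at the end of ConvertReadStream: LoadFromStream reads from the stream's current position, so a converter that leaves the position at the end of the buffer would hand the consumer nothing.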
Most of the work in TPasConversion is done by the ConvertReadStream procedure. Its purpose is to split up each line of source code into tokens (as shown previously) and then, depending on each token's TokenType, load it into the output buffer preceded by RTF codes that make it a particular colour, bold, italic etc. Here is what it looks like:
FOutBuffSize:= size+3;
ReAllocMem(FOutBuff, FOutBuffSize);
// Initialise the parser to its beginning state
FTokenState := tsUnknown;
FComment := csNo;
FBuffPos := 0;
FReadBuff := Memory;
// Write leading RTF Header
WriteToBuffer('{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS SansSerif;}' +
  '{\f1\froman\fcharset2 Symbol;}{\f2\fmodern Courier New;}}'+#13+#10);
WriteToBuffer('{\colortbl\red0\green0\blue0;}'+#13+#10);
WriteToBuffer('\deflang1033\pard\plain\f2\fs20 ');
// Create the INSTREAM (FReadBuff) and tokenize it
Result:= Read(FReadBuff^, Size);
FReadBuff[Result] := #0;
if Result > 0 then
begin
Run:= FReadBuff;
TokenPtr:= Run;
while Run^ <> #0 do
begin
Case Run^ of
#13: // Deal with CRLFs
begin
FComment:= csNo;
HandleCRLF;
end;
#1..#9, #11, #12, #14..#32: // Deal with various whitespaces, control codes
begin
while Run^ in [#1..#9, #11, #12, #14..#32] do inc(Run);
FTokenState:= tsSpace;
TokenLen:= Run - TokenPtr;
SetString(TokenStr, TokenPtr, TokenLen);
SetRTF;
WriteToBuffer(Prefix + TokenStr + Postfix);
TokenPtr:= Run;
end;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~ much code removed ~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
end;
end
EXAMPLE - snippet showing the while loop that breaks up the INSTREAM into recognised tokens
Most of the work is done by the [case Run^ of ... end;] section, which "breaks off" a token from the INSTREAM (FReadBuff) based on the logic in the case statement. The case statement is organised in such a way that it can quickly decipher the input stream into the various TokenTypes by examining each character in turn. Having worked out which TokenType it is, the actual encoding part is relatively easy.
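The encoding step comes down to escaping the token and wrapping it in RTF codes, which can be sketched on its own. This is a rough illustration under assumptions, not the Pas2Rtf source: EscapeRtf and WrapToken are hypothetical helper names, and the prefix and postfix strings (italic and bold on/off) are made-up examples of what SetRTF might supply.

```pascal
program RtfTokenSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: escape the three characters that RTF treats
// specially (backslash and both braces) by prefixing a backslash.
function EscapeRtf(const S: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if (S[I] = '\') or (S[I] = '{') or (S[I] = '}') then
      Result := Result + '\';
    Result := Result + S[I];
  end;
end;

// Hypothetical helper: surround the escaped token with its RTF codes.
function WrapToken(const Prefix, Token, Postfix: string): string;
begin
  Result := Prefix + EscapeRtf(Token) + Postfix;
end;

begin
  // A Pascal comment token contains braces, so it has to be escaped.
  WriteLn(WrapToken('\i ', '{Create Form}', '\i0 '));
  // A keyword rendered in bold; nothing needs escaping here.
  WriteLn(WrapToken('\b ', 'procedure', '\b0 '));
end.
```

The first call emits the comment with its braces escaped as \{ and \}. Without that step an RTF reader would take a Pascal comment's braces for RTF group delimiters, which is why escaping has to happen before the prefix and postfix are added.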
What's happening here (in the whitespace branch of ConvertReadStream) is that the program:
1. Sets FTokenState to what we believe the token is (in this part of the code it is tsSpace, which matches any series of whitespace characters).
2. Calculates the length of the token by working out how far the current memory pointer (Run) has moved since we finished with the last token (TokenPtr).
3. Copies the token from the input buffer, starting at its position in the memory buffer (TokenPtr) and running for the length of the token, into the variable TokenStr.
4. ScanForRtf just checks through the resultant TokenStr to ensure it doesn't have any funny characters that the RichEdit would confuse with RTF commands. If it finds any, it escapes them.
5. SetRTF looks at FTokenState to populate two global variables, Prefix and Postfix, with the appropriate RTF codes to give the token the right colour, font and boldness.
6. WriteToBuffer then simply puts the TokenStr with the Prefix and Postfix around it into the output buffer, and the loop continues on.
Back to the topic: Syntax Highlighting (on-the-fly)
No source code is necessarily 100% applicable to your needs. I was fortunate in that most of the parser applied to my 4GL command syntax (e.g. strings were strings, numbers were numbers, and the keywords were similar). YourPasEditor had also implemented most of the basic accessory tasks such as Printing, Find, Find and Replace, and Multi-File editing. It was just a matter of adding in the extras I was after.
PROBLEM #1 - No colours or fonts
One task the Parser didn't fully implement was colours, different fonts, or even font sizes. The reason for this (found after some trial and error) was that the SetRTF procedure knew nothing about how to do this. It only used the [Bold], [Italics] and [Underline] information stored in the Win95 Registry for the Delphi Editor's settings to determine how to highlight each token. As for fonts - well, I hadn't realised that the Delphi Editor actually uses only one font and font size for all the different tokens - so that wasn't Pas2Rtf's fault. I was just being greedy.
Luckily the comments in Pas2Rtf.pas told me what the other values in the Registry stood for, especially where the important foreground color was stored. This meant some changes to:
1. procedure SetDelphiRTF(S: String; aTokenState: TTokenState);
Add after the try;
Font.Color := StrToInt(Ed_List[0]);
2. procedure TPasConversion.SetPreAndPosFix
Add after FPreFix[aTokenState] = '';
FPreFixList[aTokenState] := ColorToRtf(aFont.Color);
The ColorToRtf code is already present, but hadn't been used for some reason. If you try it out you'll understand why :-). You get absolutely no change except lots of ';' in the wrong place. Change the ';' to '(space)' in ColorToRtf(), and you get rid of the ';' appearing in the RichEdit control, but still get no colours.
My first thought was that the value in Ed_List[0] didn't convert to a proper Font.Color. The easiest way to test this was to hard code Font.Color := clGreen; and see what happened. Again no luck. The format was consistent with the RTF codes I could see in the RTF header. What the $#%#$%# was wrong with it?
It was about then that I realised I needed a crash course in RTF document structure. For this I rushed off to www.microsoft.com (please forgive me) and found a reference on RTF. After an hour of reading a Microsoft Technical Document I was even more confused. Oh well - this meant it was time to get dirty. Time to get down to real programmer stuff. Time to "cheat".
What did I do? I went into WordPad (which is just a glorified RichEdit version 2.0 on steroids) and saved various files into RTF format. I then opened them in NotePad so I could see the RTF codes and compare what happened in each case: what codes were produced depending on what I did, or didn't do. A similar sort of technique was used back in the 1980s to decipher the first Paradox database format :-) Sorry Borland.
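For orientation when reading RTF files this way: in the RTF format, colours are declared once in the header's \colortbl group, each entry terminated by ';', and are then selected in the body by index with \cfN. The following is a minimal sketch of that mechanism, not code from Pas2Rtf; ColorTableEntry and clGreenValue are hypothetical names, and the colour value is assumed to use the $00BBGGRR layout of Delphi's TColor.

```pascal
program RtfColorSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  clGreenValue = $008000; // same numeric value as Delphi's clGreen

// Hypothetical helper: build one \colortbl entry from a colour value
// laid out as $00BBGGRR (red in the lowest byte).
function ColorTableEntry(Color: LongInt): string;
begin
  Result := Format('\red%d\green%d\blue%d;',
    [Color and $FF, (Color shr 8) and $FF, (Color shr 16) and $FF]);
end;

begin
  // Header: font table, then a colour table with two entries.
  // Entry 0 is black, entry 1 is green.
  WriteLn('{\rtf1\ansi{\fonttbl{\f0\fmodern Courier New;}}');
  WriteLn('{\colortbl' + ColorTableEntry(0) +
    ColorTableEntry(clGreenValue) + '}');
  // Body: \cf1 selects entry 1 (green), \cf0 switches back to entry 0.
  WriteLn('\f0\fs20 \cf1 begin\cf0  end}');
end.
```

Saving the program's output to a .rtf file and opening it in WordPad is a quick way to compare hand-built codes against what WordPad itself writes.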