CGDoom/cgdoom/os.h

#include "platform.h"
#define SYSTEM_STACK_SAFE (64*1024)
#define USER_STACK_SAFE (32*1024)
// SaveVRAMBuffer 165888 bytes
#define SAVE_VRAM_SIZE (WIDTH*HEIGHT*2-3)
extern unsigned char *SaveVRAMBuffer;
// system stack (512 kB).
#define SYSTEM_STACK_SIZE (512*1024-SYSTEM_STACK_SAFE)
extern unsigned char *SystemStack;
extern unsigned short *VRAM;
void * CGDMalloc(int iSize);
void * CGDCalloc(int iSize);
void * CGDRealloc (void *p, int iSize);
void D_DoomMain();
void CGDAppendNum09(const char *pszText,int iNum,char *pszBuf);
int CGDstrlen(const char *pszText);
void CGDstrcpy(char *pszBuf,const char *pszText);
int CGDstrcmp (const char*s1,const char*s2);
int CGDstrncmp (const char*s1,const char*s2,int iLen);
int CGDstrnicmp (const char*s1,const char*s2,int iLen);
void CGDstrncpy(char *pszBuf,const char *pszText,int iLen);
void CGDAppendNum0_999(const char *pszText,int iNum,int iMinDigits,char *pszBuf);
void CGDAppendHex32(const char *pszText,int iNum, int iDigits,char *pszBuf);
int abs(int x);
void I_Error (const char *error, ...);
void I_ErrorI(const char *str, int i1, int i2, int i3, int i4);
//force a compiler error on use of the following:
#define strcpy 12
#define strnicmp 22
#define strncmp 27
#define strcmp 33
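The defines above appear intended to make any stray call to the standard string functions fail at compile time, so that code is pushed toward the `CGD*` replacements declared earlier. A minimal sketch of the effect follows; the body of `CGDstrcmp` here is a hypothetical stand-in (the real implementation is not part of this header), and `same_lump_name` is an illustrative helper, not a function from the project.

```c
/* Hypothetical stand-in for CGDstrcmp; only the prototype comes from os.h. */
int CGDstrcmp(const char *s1, const char *s2)
{
    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }
    /* Compare as unsigned char, as the standard strcmp does. */
    return (unsigned char)*s1 - (unsigned char)*s2;
}

/* Same poisoning trick as in os.h: the name now expands to a number. */
#define strcmp 33

/* Illustrative helper (assumption): compares two names via the replacement. */
int same_lump_name(const char *a, const char *b)
{
    /* Writing strcmp(a, b) here would expand to 33(a, b),
       which the compiler rejects because 33 is not callable. */
    return CGDstrcmp(a, b) == 0;
}
```

Note that the trick only works if the standard `<string.h>` is not included after these defines, since its own declarations would then be rewritten as well.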
// Optimize loading speed (x2.7) and game speed (+35%)
//
// Loading is measured by RTC_GetTicks().
// * Initial version: 9.8s
//   This was a regression due to using 512-byte sectors instead of 4 kiB
//   clusters as previously.
// * Do BFile reads of 4 kiB: 5.2s (-47%)
//   Feels similar to original code, I'll take this as my baseline.
// * Test second half of Flash first: 3.6s (-31%)
//   By reading from FLASH_FS_HINT to FLASH_END first many OS sectors can be
//   skipped (without missing on other sectors just in case).
// * Load to XRAM instead of RAM with BFile
//   The DMA is 10% slower to XRAM than to RAM, but this benefits memcmp()
//   because of faster memory accesses through the operand bus. No effect at
//   this point, but ends up saving 8% after memcmp is optimized.
// * Optimize memcmp for sectors: 3376 ms (-8%)
//   The optimized memcmp uses word accesses for ROM (which is fastest), and
//   weaves loop iterations to exploit superscalar parallelism.
// * Search sectors most likely to contain data first: 2744 ms (-19%)
//   File fragments almost always start on 4-kiB boundaries between
//   FLASH_FS_HINT and FLASH_END, so these are tested first.
// * Index most likely sectors, improve FLASH_FS_HINT: 2096 ms (-24%)
//   Most likely sectors are indexed by first 4 bytes and binary searched, and
//   a slightly larger region is considered for hints. The cache hits 119/129
//   fragments in my case.
// * Use optimized memcmp for consecutive fragments: 1408 ms (-33%)
//   I only set it for the search of the first sector in each fragment and
//   forgot to use it where it is really needed. x)
//
// Game speed is measured roughly by the time it takes to hit a wall by
// walking straight after spawning in Hangar.
// * Initial value: 4.4s
// * Use cached ROM when loading data from the WAD: 2.9s (-35%)
//   Cached accesses are quite detrimental for sector search, I assume because
//   everything is aligned like crazy, but it's still a major help when
//   reading sequential data in real-time.

//return ptr to flash
int FindInFlash(const void **buf, int size, int readpos);
//direct read from flash
int Flash_ReadFile(void *buf, int size, int readpos);
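The commit notes above mention indexing the most likely sectors by their first 4 bytes and binary searching that index. The sketch below illustrates that general idea only; `SectorEntry`, `first4`, `index_build` and `index_find` are hypothetical names, and nothing here is taken from the actual CGDoom sources.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: a 4-byte key plus the sector it came from. */
typedef struct {
    uint32_t key;
    const void *sector;
} SectorEntry;

/* Read the first 4 bytes of a sector as the key (memcpy avoids alignment issues). */
static uint32_t first4(const void *p)
{
    uint32_t k;
    memcpy(&k, p, sizeof k);
    return k;
}

/* qsort comparator ordering entries by key. */
static int cmp_entry(const void *a, const void *b)
{
    uint32_t ka = ((const SectorEntry *)a)->key;
    uint32_t kb = ((const SectorEntry *)b)->key;
    return (ka > kb) - (ka < kb);
}

/* Build a sorted index over `count` sectors of `sector_size` bytes each. */
void index_build(SectorEntry *idx, const unsigned char *base,
                 int count, int sector_size)
{
    for (int i = 0; i < count; i++) {
        idx[i].sector = base + (size_t)i * sector_size;
        idx[i].key = first4(idx[i].sector);
    }
    qsort(idx, (size_t)count, sizeof *idx, cmp_entry);
}

/* Return the position of the first entry whose key matches, or -1. */
int index_find(const SectorEntry *idx, int count, uint32_t key)
{
    int lo = 0, hi = count;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (idx[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < count && idx[lo].key == key) ? lo : -1;
}
```

A key match is only a candidate: several sectors can share the same first 4 bytes, so a caller would still compare the full sector contents, walking forward over entries with equal keys, and fall back to a slower scan when the index misses.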
#define min(x,y) ({ \
__auto_type __x = (x); \
__auto_type __y = (y); \
__x < __y ? __x : __y; \
})
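The `min` macro above relies on two GNU C extensions, statement expressions and `__auto_type`, so it needs GCC or Clang. The point of copying the arguments into locals is that each one is evaluated exactly once. The sketch below shows this with a side-effecting argument; `next_value`, `calls` and `min_demo` are illustrative names, not part of the project.

```c
/* Same macro as in os.h (GNU C: statement expression + __auto_type). */
#define min(x,y) ({ \
    __auto_type __x = (x); \
    __auto_type __y = (y); \
    __x < __y ? __x : __y; \
})

/* Counts how many times the argument expression is evaluated. */
static int calls = 0;

static int next_value(void)
{
    return ++calls;
}

/* With the macro above, next_value() runs once per call to min_demo().
   A naive ((x) < (y) ? (x) : (y)) macro would run it twice whenever
   the first argument is the smaller one. */
int min_demo(void)
{
    return min(next_value(), 10);
}
```

The first call to `min_demo()` returns 1 and leaves `calls` at 1, which is the behaviour one would expect from a function rather than a textual macro.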