👨‍💻 Seungineer's GitHub Contribution

🧭 KAIST JUNGLE/Pintos

[PintOS] User Program - Arguments parsing (Project 2, TIL)

seungineer = seungwoo + engineer 2024. 5. 6. 21:03

These are my study notes, written while following the KAIST PintOS lectures and Instruction and the Hanyang University PintOS slides.

I wrote them while still learning, so they may contain mistakes.


 

Arguments Parsing

The command line has to be split on spaces (' '). Right now the arguments are not split and arrive as a single array. In that form the filename cannot be found, and command options cannot be used either. Pintos provides a string-splitting function, strtok_r(), which can be used for this.

strtok_r()

tid_t
process_create_initd (const char *file_name) {
	char *fn_copy;
	char *temp_ptr;
	char *filename;
	tid_t tid;
	int size;
	/* Note: strtok_r() writes '\0' into its input, so file_name is modified in place. */
	filename = strtok_r((char *) file_name, " ", &temp_ptr); // arguments parsing
    ...
}

The thing to watch in the function above is that the delimiter argument must be " ", not ' '. The delimiters parameter is a const char *, so it needs a null-terminated string rather than a single character constant. strtok_r() overwrites the delimiter after each token with a null character ('\0'), and that terminator is what allows each parsed word to be handled as a string later.

/*
Example usage:

   char s[] = "  String to  tokenize. ";
   char *token, *save_ptr;

   for (token = strtok_r (s, " ", &save_ptr); token != NULL;
   token = strtok_r (NULL, " ", &save_ptr))
   printf ("'%s'\n", token);

outputs:

'String'
'to'
'tokenize.'
*/
char *
strtok_r (char *s, const char *delimiters, char **save_ptr) {
	char *token;

	if (s == NULL)
		s = *save_ptr;

	/* Skip any DELIMITERS at our current position. */
	while (strchr (delimiters, *s) != NULL) {
		/* strchr() will always return nonnull if we're searching
		   for a null byte, because every string contains a null
		   byte (at the end). */
		if (*s == '\0') {
			*save_ptr = s;
			return NULL;
		}

		s++;
	}

	/* Skip any non-DELIMITERS up to the end of the string. */
	token = s;
	while (strchr (delimiters, *s) == NULL)
		s++;
	if (*s != '\0') {
		*s = '\0';
		*save_ptr = s + 1;
	} else
		*save_ptr = s;
	return token;
}

์œ„ ํ•จ์ˆ˜๋Š” ๋ฌธ์ž์—ด์„ ๊ตฌ๋ถ„์ž ๊ธฐ์ค€์œผ๋กœ ํŒŒ์‹ฑํ•˜์—ฌ token pointer๋กœ ์ ‘๊ทผ ๊ฐ€๋Šฅํ•จ์„ ์•Œ ์ˆ˜ ์žˆ๋‹ค. token pointer๊ฐ€ ๋ถ„๋ฆฌ๋œ ๊ฐ ๋‹จ์–ด ๋ฐฐ์—ด์˜ ์‹œ์ž‘์ ์ด๋‹ค.

Arguments Passing

When a user process is initialized, the argument values it needs must be pushed onto the user stack. Pintos provides the hex_dump() function for printing stack memory to the screen in hexadecimal.

KAIST PintOS Instruction: Arguments Passing

The stack is laid out as shown in the figure above, and the implementation just has to match that layout.

# Approach

The stack top pointer (_if.rsp) has to be moved down to make room for the arguments vector (that is, the stack is expanded). As the stack grows, memcpy can store argv[i] in the newly expanded region. Implemented in that order, it looks like this.

	for (int i = temp_cnt-1; i >= 0;i-- ){               // temp_cnt: number of parsed arguments
		_if.rsp = _if.rsp - (strlen(argv[i])+1);     // expand the stack (string length + '\0')
		memcpy(_if.rsp, argv[i], strlen(argv[i])+1); // copy argv[i] into the expanded space
		argv_ptr[i] = _if.rsp;                       // save the stack address where argv[i] now lives
	}

memcpy()

Copies 3️⃣ strlen(argv[i])+1 bytes 2️⃣ from the address argv[i] 1️⃣ to the address _if.rsp (the numbers are the argument positions).

💡 Note that the third argument is not sizeof(argv[i]): that is the size of a pointer, not the length of the string.
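To make the length argument concrete, here is a small sketch of a single push step done on an ordinary char buffer instead of the real user stack. The helper name push_string is hypothetical, and a char * stands in for _if.rsp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Moves the simulated stack pointer *SP down by the length of S plus its
   terminating '\0', copies S there, and returns the new stack top. */
static char *
push_string (char **sp, const char *s) {
	size_t len;

	assert (sp != NULL && *sp != NULL && s != NULL);
	len = strlen (s) + 1;	/* characters plus '\0', not sizeof (s) */
	*sp -= len;		/* expand the stack downward */
	memcpy (*sp, s, len);	/* destination, source, byte count */
	return *sp;
}
```

With s = "onearg" the copy length is 7 bytes, while sizeof of a char * would be the pointer size regardless of the string, which is why strlen() + 1 is the right count here.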

After that, if the stack top pointer (_if.rsp) is not at a multiple of 8 (this is an x86-64 environment), the stack is expanded with padding until it is. The padding bytes are set to 0 (the data part at 0x4747ffe8 in the figure).

The user stack also needs an argv_ptr area holding the address of each argument string (argv[i] in the reference material). Be careful with the number of loop iterations here (^^.. covered in the Debug section below).

	for (int i = temp_cnt-1; i >= 0;i-- ){
		_if.rsp = _if.rsp - strlen(argv[i])-1; // expand the stack
		memcpy(_if.rsp, argv[i], strlen(argv[i])+1); // copy argv[i] into the expanded space
		argv_ptr[i] = _if.rsp; // save the stack address where argv[i] now lives
	}

    while (_if.rsp % 8 != 0) {       // until rsp is a multiple of 8,
        _if.rsp --;                  // expand the stack by one byte
        *(uint8_t *) _if.rsp = 0;    // and fill that byte with zero padding
    }
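The alignment arithmetic can be checked on its own. The sketch below uses two hypothetical helpers, padding_bytes and align_down_8, and assumes for illustration that the string area ends at 0x4747ffed, so that the aligned address matches the 0x4747ffe8 mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of zero bytes to push so that ADDR drops to a multiple of 8.
   The stack grows downward, so the remainder itself is the padding. */
static size_t
padding_bytes (uintptr_t addr) {
	return (size_t) (addr % 8);
}

/* ADDR rounded down to the nearest multiple of 8. */
static uintptr_t
align_down_8 (uintptr_t addr) {
	uintptr_t aligned = addr - padding_bytes (addr);

	assert (aligned % 8 == 0);
	return aligned;
}
```

Under that assumption, 0x4747ffed % 8 is 5, so five zero bytes are pushed and the pointer lands on 0x4747ffe8.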

In addition, the user stack should always have the same shape so that it is easy to work with (see the Instruction figure). As with argv[0]~[4] in the reference stack figure, the addresses that point to the argv data must also be stored, so the stack is expanded again and the address of each argument is copied in. Finally, expanding by 8 bytes for the fake address completes the user stack setup.

    _if.rsp -= 8;                     // 8 zero bytes to match the layout (the null entry after the last argv pointer)
    memset(_if.rsp, 0, 8);

    for (int i = temp_cnt - 1; i >= 0; i--)
    {
        _if.rsp -= 8;                     // expand the stack by 8 bytes
        memcpy(_if.rsp, &argv_ptr[i], 8); // copy the user-stack address of argv[i] saved earlier
    }

    _if.rsp -= 8;                     // fake return address to match the layout (8 bytes)
    memset(_if.rsp, 0, 8);

    _if.R.rdi = temp_cnt;            // rdi register: arguments count
    _if.R.rsi = (char *)_if.rsp + 8; // rsi register: address of argv[0] on the user stack, just above the fake address

 

🐛 Debug

Case: hex_dump() shows nothing stored in memory

Abnormal hex_dump() output (left), output after debugging (right)

Nothing that had been parsed and pushed onto the stack so far was visible. Two causes seemed possible. The first was that the address hex_dump starts from was not the stack top pointer; the second was that the data had not been pushed onto the stack correctly. To check the first case I looked at hex_dump() itself. _if.rsp (the stack top pointer) is passed to hex_dump() as an argument, and inside the function it is used to print starting from the _if.rsp address, so there was nothing wrong there. The error therefore looked like the second case, and I was able to fix it in the loop that pushes onto the stack.

	for (int i = temp_cnt-1; i >= 0;i-- ){
		_if.rsp = _if.rsp - strlen(argv[i])-1;       // expand the stack
		memcpy(_if.rsp, argv[i], strlen(argv[i])+1); // copy argv[i] into the expanded space
		argv_ptr[i] = _if.rsp;                       // save the stack address where argv[i] now lives
	}

Every pass through the loop expands the stack (decreases _if.rsp). So the number of iterations matters: the loop should run only as many times as there are argv entries to copy (temp_cnt times), but I had mistakenly assumed it should run argc - 1 times, and that caused the error 🥲 Anyway, it got solved~ ƪ( ˘ ⌣˘ )ʃ


+ A fun tech article I read yesterday

Exploring the world of abstractions hidden behind a modern Hello World program