Fixed-point in C/C++

2016. 4. 5. 21:03

fixed-point in C

original material: artist-embedded.org/EmbeddedControl Slides
reference 1: fixedpt.html
reference 2: Q_format

1. Fixed-point Representation

x: real number
X: fixed-point number (the stored integer)
N: word length in bits
m: number of integer bits (excluding the sign bit)
f: number of fraction bits
"Q-format": Qm.f, with N = 1 + m + f

example) Q4.3, N = 8
0/1101/011
sign bit/4-bit integer/3-bit fraction
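
The field layout above can be checked with a few shifts and masks. The following is a minimal sketch, assuming an 8-bit Q4.3 word; the names `q4_3_fields`, `q4_3_split` and `q4_3_nonneg_value` are hypothetical and not part of the original material. The simple field reading only gives the value directly for non-negative words, because negative numbers are stored in 2's complement.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct holding the three bit fields of a Q4.3 word. */
typedef struct {
    unsigned sign;      /* 1 bit                      */
    unsigned integer;   /* 4 bits                     */
    unsigned fraction;  /* 3 bits, in units of 1/8    */
} q4_3_fields;

/* Split an 8-bit word into sign / integer / fraction fields. */
q4_3_fields q4_3_split(uint8_t word)
{
    q4_3_fields out;
    out.sign     = (word >> 7) & 0x1u;
    out.integer  = (word >> 3) & 0xFu;
    out.fraction = word & 0x7u;
    return out;
}

/* Real value of a non-negative Q4.3 word: integer + fraction / 2^3. */
double q4_3_nonneg_value(uint8_t word)
{
    q4_3_fields s = q4_3_split(word);
    assert(s.sign == 0);  /* negative words need 2's complement handling */
    return (double)s.integer + (double)s.fraction / 8.0;
}
```

For the word 0/1101/011 (0x6B) this gives integer field 13 and fraction field 3, that is 13 + 3/8 = 13.375.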

2. Conversion to and from fixed-point

real to fixed
- Multiply the floating point number by 2^f
- Round to the nearest integer
  $X = \mathrm{round}(x \cdot 2^f)$

fixed to real
- Multiply the fixed-point number by 2^(-f)
  $x = X \cdot 2^{-f}$

example) 13.4 to Q4.3 format

$X = \mathrm{round}(13.4 \cdot 2^3) = 107 \; (= 01101011_2)$
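
The two conversions can be sketched in C as follows. This is only an illustration under stated assumptions: the function names `real_to_fixed` and `fixed_to_real` are hypothetical, the word length is fixed at 16 bits, rounding is half away from zero, and the caller is assumed to pass a value that fits in the target format (no range check is done).

```c
#include <assert.h>
#include <stdint.h>

/* real -> fixed: X = round(x * 2^f) */
int16_t real_to_fixed(double x, int f)
{
    assert(f >= 0 && f <= 14);
    double scaled = x * (double)(1 << f);
    /* Round half away from zero without needing <math.h>. */
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;
    /* Assumption: the result fits in int16_t; the cast truncates toward zero. */
    return (int16_t)scaled;
}

/* fixed -> real: x = X * 2^(-f) */
double fixed_to_real(int16_t X, int f)
{
    assert(f >= 0 && f <= 14);
    return (double)X / (double)(1 << f);
}
```

With f = 3, `real_to_fixed(13.4, 3)` gives 107, and converting back with `fixed_to_real(107, 3)` gives 13.375, so the quantization error in this example is 0.025.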

3. Range of fixed-point representation

negative number: 2's complement
- N=8: -2^(8-1) ~ 2^(8-1) - 1, i.e. -128 ~ 127
  
  binary representation decimal
  
  00000000 0
  
  00000001 1
  
  00000010 2
  
  … …
  
  01111111 127
  
  10000000 -128
  
  10000001 -127
  
  … …
  
  11111111 -1
range of Qm.f [ref]

$[-2^m, \; 2^m - 2^{-f}]$
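
As a quick check of the range, the two bounds can be computed directly: the smallest value is -2^m and the largest is one step (2^(-f)) below 2^m. The helper names `q_min` and `q_max` below are hypothetical and only serve this illustration.

```c
#include <assert.h>

/* Smallest representable value of Qm.f: -2^m. */
double q_min(int m)
{
    assert(m >= 0 && m <= 30);
    return -(double)(1L << m);
}

/* Largest representable value of Qm.f: 2^m - 2^(-f). */
double q_max(int m, int f)
{
    assert(m >= 0 && m <= 30 && f >= 0 && f <= 30);
    return (double)(1L << m) - 1.0 / (double)(1L << f);
}
```

For Q4.3 this gives the range [-16, 15.875], which is the 8-bit integer range [-128, 127] scaled by 2^(-3).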

4. Arithmetic operations of fixed-point

Saturation check

#include <stdint.h>

// Clamp a 32-bit intermediate result to the int16_t range [-32768, 32767].
int16_t sat16(int32_t x)
{
    if (x > 0x7FFF) return 0x7FFF;
    else if (x < -0x8000) return -0x8000;
    else return (int16_t)x;
}

Addition

int16_t q_add_sat(int16_t a, int16_t b)
{
    int16_t result;
    int32_t tmp;

    tmp = (int32_t)a + (int32_t)b;
    if (tmp > 0x7FFF)
        tmp = 0x7FFF;
    if (tmp < -1 * 0x8000)
        tmp = -1 * 0x8000;
    result = (int16_t)tmp;

    return result;
}

Subtraction

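int16_t q_sub(int16_t a, int16_t b)
{
    int16_t result;
    // No saturation here: if a - b falls outside the int16_t range, the
    // conversion back to int16_t is implementation-defined (it usually wraps).
    result = (int16_t)(a - b);
    return result;
}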
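
The subtraction above does not saturate, unlike the addition. A saturating variant can be sketched in the same way as `q_add_sat`; the names `q_sub_sat` and `q_sub_sat_demo` are hypothetical and this is an illustration rather than part of the original material.

```c
#include <assert.h>
#include <stdint.h>

/* Subtract in 32 bits, then clamp to the int16_t range. */
int16_t q_sub_sat(int16_t a, int16_t b)
{
    int32_t tmp = (int32_t)a - (int32_t)b;
    if (tmp > INT16_MAX)
        tmp = INT16_MAX;
    if (tmp < INT16_MIN)
        tmp = INT16_MIN;
    return (int16_t)tmp;
}

/* Small usage check; the values are raw integers, independent of f. */
void q_sub_sat_demo(void)
{
    assert(q_sub_sat(576, 384) == 192);             /* ordinary case        */
    assert(q_sub_sat(INT16_MIN, 1) == INT16_MIN);   /* clamps at the bottom */
    assert(q_sub_sat(INT16_MAX, -1) == INT16_MAX);  /* clamps at the top    */
}
```

Both operands must use the same Q format, because the subtraction works on the raw integers without any rescaling.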

Multiplication

// precomputed values:
#define Q   8                 // number of fraction bits (assumed value, matching the Q8 format mentioned under Division)
#define K   (1 << (Q - 1))    // half of one LSB after the shift, used for rounding

int16_t q_mul(int16_t a, int16_t b)
{
    int16_t result;
    int32_t temp;

    // Cast first so the product is computed in 32 bits; it has 2*Q fraction bits
    temp = (int32_t)a * (int32_t)b;
    // Rounding; mid values are rounded up
    temp += K;
    // Correct by dividing by base and saturate result
    result = sat16(temp >> Q);

    return result;
}
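
To make the rounding and rescaling concrete, here is a self-contained sketch that takes the number of fraction bits as a run-time argument instead of a macro. The name `q_mul_f` is hypothetical. It assumes that right-shifting a negative `int32_t` is an arithmetic shift, which is implementation-defined in C but common in practice.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two fixed-point numbers that both have f fraction bits. */
int16_t q_mul_f(int16_t a, int16_t b, int f)
{
    assert(f >= 1 && f <= 15);
    int32_t temp = (int32_t)a * (int32_t)b;  /* product has 2f fraction bits  */
    temp += (int32_t)1 << (f - 1);           /* add half an LSB for rounding  */
    temp >>= f;                              /* back to f fraction bits       */
    if (temp > INT16_MAX)                    /* saturate to the int16_t range */
        temp = INT16_MAX;
    if (temp < INT16_MIN)
        temp = INT16_MIN;
    return (int16_t)temp;
}
```

With f = 8, 1.5 is stored as 384 and 2.25 as 576. The raw product is 221184; adding 128 and shifting right by 8 gives 864, which is 3.375 in the same format.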

Division

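int16_t q_div(int16_t a, int16_t b)
{
    int16_t result;
    int32_t temp;

    // Pre-multiply by the base so the dividend has 2*Q fraction bits and the
    // quotient is back in Q format (e.g. Q16 / Q8 -> Q8 when Q is 8).
    // Multiply instead of shifting: left-shifting a negative value is undefined.
    temp = (int32_t)a * (1 << Q);
    // Rounding: mid values are rounded up (down for negative values).
    if ((temp >= 0 && b >= 0) || (temp < 0 && b < 0))
        temp += b / 2;
    else
        temp -= b / 2;
    // b must be non-zero; the quotient is not saturated here.
    result = (int16_t)(temp / b);

    return result;
}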
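
The same division can be sketched with the fraction width passed as an argument, plus the two guards the version above leaves out: a non-zero divisor and saturation of the quotient. The name `q_div_f` is hypothetical and the guards are additions for illustration, not part of the original routine.

```c
#include <assert.h>
#include <stdint.h>

/* Divide two fixed-point numbers that both have f fraction bits. */
int16_t q_div_f(int16_t a, int16_t b, int f)
{
    assert(b != 0);
    assert(f >= 0 && f <= 15);
    /* Scale the dividend by 2^f so the quotient keeps f fraction bits. */
    int32_t temp = (int32_t)a * ((int32_t)1 << f);
    /* Round to nearest: move the dividend half a divisor away from zero. */
    if ((temp >= 0) == (b > 0))
        temp += b / 2;
    else
        temp -= b / 2;
    temp /= b;
    if (temp > INT16_MAX)   /* saturate to the int16_t range */
        temp = INT16_MAX;
    if (temp < INT16_MIN)
        temp = INT16_MIN;
    return (int16_t)temp;
}
```

With f = 8, dividing 3.375 (864) by 1.5 (384) scales the dividend to 221184, adds 192 for rounding, and divides by 384 to give 576, which is 2.25.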

License: Attribution, Non-commercial
